I use the gsub function in R to remove unwanted characters from numbers. I want to strip from the strings every character that is not a digit, a period (.), or a minus sign (-). My problem is that my regular expression does not remove some non-numeric characters, such as d, +, and <.
Below are my regular expression, the gsub call, and its output. How can I change the regular expression to get the desired output?
Current output:
gsub(pattern = '[^(-?(\\d*\\.)?\\d+)]', replacement = '', x = c('1.2<', '>4.5', '3+.2', '-1d0', '2aadddab2','1.3h'))
[1] "1.2<" ">4.5" "3+.2" "-1d0" "2ddd2" "1.3"
Desired output:
[1] "1.2" "4.5" "3.2" "-10" "22" "1.3"
Thank you.
Simply use
gsub("[^0-9.-]", "", x)
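As a quick check, here is that pattern applied to the vector from the question (the variable names x and cleaned are only for illustration). A likely reason the original pattern misbehaves: inside [...] the grouping and quantifier syntax loses its meaning, so (-? is read as a character range from ( to ?, which covers the digits but also +, < and >, and the d from \\d appears to be taken as a literal d, which matches the output shown in the question.

```r
# Sample input copied from the question
x <- c('1.2<', '>4.5', '3+.2', '-1d0', '2aadddab2', '1.3h')

# Negated character class: delete everything that is not a digit, "." or "-".
# The "-" is placed last inside the brackets so it is literal, not a range.
cleaned <- gsub("[^0-9.-]", "", x)

print(cleaned)
# expected values: "1.2" "4.5" "3.2" "-10" "22" "1.3"
```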
If a string can contain more than one - or ., you can run a second regex to deal with that.
If you struggle with it, open a new question.
(Make sure to replace . with , in the pattern if needed.)
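For the case of repeated - or . mentioned above, one possible follow-up step is sketched below. The helper tidy_number is hypothetical (it is not part of the original answer), and it assumes the intended rule is "keep a - only as the first character and keep only the first ."; adjust it if your data needs a different rule.

```r
# Hypothetical helper, not from the original answer.
# Assumption: a "-" is only valid as the first character,
# and only the first "." is a decimal separator.
tidy_number <- function(s) {
  # Step 1: the pattern from the answer, drop everything but digits, "." and "-"
  s <- gsub("[^0-9.-]", "", s)
  # Step 2: drop every "-" that is not at the very start of the string
  s <- gsub("(?!^)-", "", s, perl = TRUE)
  # Step 3: keep the first "." and remove any later ones
  first_dot <- as.integer(regexpr(".", s, fixed = TRUE))  # -1 when there is no "."
  before <- substr(s, 1, first_dot)
  after <- gsub(".", "", substr(s, first_dot + 1, nchar(s)), fixed = TRUE)
  ifelse(first_dot == -1, s, paste0(before, after))
}

result <- tidy_number(c("1.2.3", "-1-0", "--4.5.", "22", "a1.2b.3-"))
print(result)
# expected values: "1.23" "-10" "-4.5" "22" "1.23"
```

Note that under this rule a string such as "1-2" becomes "12", so check that this is what you want before using it on real data.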