TransWikia.com

Replace patterns separated by delimiter in R

Stack Overflow Asked on December 16, 2021

I need to replace the prefix matching "CBII_*_*_" with "MAP_" in the vector tt below.

tt <- c("CBII_27_1018_62770", "CBII_2733_101448_6272", "MAP_1222")

I tried

gsub("CBII_*_*", "MAP_", tt), which doesn't give the expected result. What would be the solution so that I get:

"MAP_62770", "MAP_6272", "MAP_1222"

3 Answers

sub(".*(?<=_)(\\d+)$", "MAP_\\1", tt, perl = TRUE)
[1] "MAP_62770" "MAP_6272"  "MAP_1222"

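Here we use a positive lookbehind, (?<=_), to assert that an underscore sits immediately to the left of the capturing group (\\d+), which matches the digits at the very end of the string ($). The leading .* consumes everything before those digits. In the replacement argument to sub, \\1 recalls the captured digits, and MAP_ is placed in front of them. perl = TRUE is needed because R's default regex engine does not support lookbehind.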
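
Because the underscore can be matched directly, the lookbehind is not strictly required. A minimal sketch, assuming the same tt as in the question (res_plain is just an illustrative name), that should give the same result with the default regex engine and no perl = TRUE:

```r
tt <- c("CBII_27_1018_62770", "CBII_2733_101448_6272", "MAP_1222")

# Greedy .* runs up to the last underscore; (\\d+)$ captures the trailing digits,
# and \\1 puts them back after the new "MAP_" prefix.
res_plain <- sub(".*_(\\d+)$", "MAP_\\1", tt)
res_plain
# [1] "MAP_62770" "MAP_6272"  "MAP_1222"
```

Elements that do not end in an underscore followed by digits are left unchanged by this pattern.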

Answered by Chris Ruehlemann on December 16, 2021

An option with trimws from base R along with paste0. The whitespace argument (available since R 3.6.0) is treated as a regular expression, so passing ".*_" makes trimws strip everything up to and including the last _. paste0 then prepends the new prefix "MAP_".

paste0("MAP_", trimws(tt, whitespace = ".*_"))
#[1] "MAP_62770" "MAP_6272"  "MAP_1222" 
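
On R versions older than 3.6.0, where trimws has no whitespace argument, a single sub call should behave much the same for this data. This is only a sketch, not part of the original answer, and res_sub is an illustrative name:

```r
tt <- c("CBII_27_1018_62770", "CBII_2733_101448_6272", "MAP_1222")

# Replace everything up to and including the last underscore with the new prefix.
res_sub <- sub("^.*_", "MAP_", tt)
res_sub
# [1] "MAP_62770" "MAP_6272"  "MAP_1222"
```

One difference to keep in mind: a string with no underscore at all is left untouched by this sub call, whereas the paste0 version would still prepend "MAP_" to it.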

Answered by akrun on December 16, 2021

You can use:

gsub("^CBII_.*_.*_", "MAP_", tt)

or

stringr::str_replace(tt, "^CBII_.*_.*_", "MAP_")

Output

[1] "MAP_62770" "MAP_6272"  "MAP_1222"
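
Note that .* is greedy and also matches underscores, so the pattern above strips everything up to the last underscore when an ID has more than three fields. If the two middle fields are always numeric (an assumption based on the sample data, not something the question states), a stricter sketch would be:

```r
tt <- c("CBII_27_1018_62770", "CBII_2733_101448_6272", "MAP_1222")

# Only digits are allowed in the two fields after "CBII_".
strict <- sub("^CBII_\\d+_\\d+_", "MAP_", tt)
strict
# [1] "MAP_62770" "MAP_6272"  "MAP_1222"

# Hypothetical input with an extra field, to show where the two patterns differ.
extra <- "CBII_1_2_3_4"
greedy_res <- sub("^CBII_.*_.*_", "MAP_", extra)      # "MAP_4"
strict_res <- sub("^CBII_\\d+_\\d+_", "MAP_", extra)  # "MAP_3_4"
```

Since the prefix can occur at most once per string because of the ^ anchor, sub is sufficient here and gsub is not needed.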

Answered by slava-kohut on December 16, 2021
