Modify variable length strings in R


I'm working with a data table whose first column ("first") contains strings (this is a subset):

    > first
    # [1] "a10"    "a10r"   "a1112"  "a1112r" "a116"   "a116r"  "a1212"  "a1212r" "a126"   "a126r"  "a1312"  "a1312r" "a136"   "a136r"  "a20"    "a20r"
    #[17] "a2112"  "a2112r" "a216"   "a216r"  "a2212"  "a2212r" "a226"   "a226r"  "a2312"  "a2312r" "a236"   "a236r"  "a30"    "a30r"   "a3112"  "a3112r"

I am trying to bring every string to a final format of 6 characters by adding specific characters at different positions.

I used the following commands, from first to last. First, add "s" to the strings not containing "r":

    > middle1 <- ifelse(!grepl("r", first), paste0(first, "s"), first)
    # [1] "a10s"   "a10r"   "a1112s" "a1112r" "a116s"  "a116r"  "a1212s" "a1212r" "a126s"  "a126r"  "a1312s" "a1312r" "a136s"  "a136r"  "a20s"   "a20r"
    #[17] "a2112s" "a2112r" "a216s"  "a216r"  "a2212s" "a2212r" "a226s"  "a226r"  "a2312s" "a2312r" "a236s"  "a236r"  "a30s"   "a30r"   "a3112s" "a3112r"

Then the following command adds the digit "0" after the first character if the string has fewer than 5 characters:

    > middle2 <- ifelse(nchar(middle1) < 5, gsub('^(.{1})(.*)$', '\\10\\2', middle1[nchar(middle1) < 5]), middle1)
    # [1] "a010s"  "a010r"  "a1112s" "a1112r" "a116s"  "a116r"  "a1212s" "a1212r" "a126s"  "a126r"  "a1312s" "a1312r" "a136s"  "a136r"  "c020s"  "c020r"
    #[17] "a2112s" "a2112r" "a216s"  "a216r"  "a2212s" "a2212r" "a226s"  "a226r"  "a2312s" "a2312r" "a236s"  "a236r"  "b030s"  "b030r"  "a3112s" "a3112r"

I repeated the previous command, this time adding the digit "0" after the third character if the string has fewer than 6 characters. That brought every string to 6 characters.

    > last <- ifelse(nchar(middle2) < 6, gsub('^(.{3})(.*)$', '\\10\\2', middle2[nchar(middle2) < 6]), middle2)
    # [1] "a0100s" "a0100r" "a1112s" "a1112r" "a1206s" "a1206r" "a1212s" "a1212r" "c0200s" "c0200r" "a1312s" "a1312r" "a2206s" "a2206r" "a2306s" "a2306r"
    #[17] "a2112s" "a2112r" "a3106s" "a3106r" "a2212s" "a2212r" "a3306s" "a3306r" "a2312s" "a2312r" "a4206s" "a4206r" "a4306s" "a4306r" "a3112s" "a3112r"

However, the problem I am encountering is that positions within the vector have been moved around (for example, "c0200s" and "c0200r" have changed positions). Ultimately, I need to use these strings to label rows, so I need them in their original positions. I'm a newbie, so if this question has been asked before, is obvious, or is written incorrectly, I apologize in advance.

So my question is: how do I modify strings in R without reordering the vector?

Here is a simpler way to implement your logic. In addition, I suggest doing some sanity checks along the way to make sure the logic is robust.

    first <- c("a10","a10r","a1112","a1112r","a116","a116r","a1212","a1212r",
        "a126","a126r","a1312","a1312r","a136","a136r","a20","a20r",
        "a2112","a2112r","a216","a216r","a2212","a2212r","a226","a226r",
        "a2312","a2312r","a236","a236r","a30","a30r","a3112","a3112r")

    library(stringi)

    # Split each string into single characters, edit the pieces, then paste back.
    # lapply() handles one string at a time, so the original order is preserved.
    unlist(lapply(stri_split_boundaries(first, type="character"), function(x) {

        if (length(x) < 3) {
            print(x)
            stop("logic does not apply correctly")
        }

        # add "s" to strings not ending in "r"
        if (tail(x, 1) != "r") x <- c(x, "s")

        if (length(x) < 4) {
            print(x)
            stop("logic does not apply correctly")
        }

        # add the digit "0" after the first character if there are fewer than 5 characters
        if (length(x) < 5) x <- c(x[1], "0", x[-1])

        if (length(x) < 5) {
            print(x)
            stop("logic does not apply correctly")
        }

        # add the digit "0" after the third character if there are fewer than 6 characters
        if (length(x) < 6) x <- c(x[seq_len(3)], "0", x[-seq_len(3)])

        if (length(x) != 6) {
            print(x)
            stop("check logic.")
        }

        paste(x, collapse="")
    }))
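As for why the positions moved in the question: in both `ifelse()` calls the `yes` argument is `gsub()` applied to a subset such as `middle1[nchar(middle1) < 5]`. That subset is shorter than the test vector, so `ifelse()` recycles it and picks values by position, which is most likely what misaligned the results. Below is a minimal base R sketch of the same three steps that always passes the full vector, so nothing gets recycled. The function name `pad_labels` is made up for illustration, and the sketch assumes every string needs at most the three edits described in the question.

```r
# Same data as in the question.
first <- c("a10","a10r","a1112","a1112r","a116","a116r","a1212","a1212r",
           "a126","a126r","a1312","a1312r","a136","a136r","a20","a20r",
           "a2112","a2112r","a216","a216r","a2212","a2212r","a226","a226r",
           "a2312","a2312r","a236","a236r","a30","a30r","a3112","a3112r")

# Hypothetical helper: applies the three steps to a character vector.
# Every ifelse() argument has the same length as x, so positions never shift.
pad_labels <- function(x) {
  # Step 1: append "s" to strings not containing "r".
  x <- ifelse(grepl("r", x), x, paste0(x, "s"))
  # Step 2: insert "0" after the 1st character if fewer than 5 characters.
  # In the replacement, "\\10" is backreference \1 followed by a literal 0.
  x <- ifelse(nchar(x) < 5, sub("^(.)(.*)$", "\\10\\2", x), x)
  # Step 3: insert "0" after the 3rd character if fewer than 6 characters.
  x <- ifelse(nchar(x) < 6, sub("^(.{3})(.*)$", "\\10\\2", x), x)
  x
}

last <- pad_labels(first)

# Sanity checks: same length as the input, and every label has 6 characters.
stopifnot(length(last) == length(first), all(nchar(last) == 6))

last[1:6]
# [1] "a0100s" "a0100r" "a1112s" "a1112r" "a1106s" "a1106r"
```

Because each element of `last` is derived from the element of `first` at the same index, the result can be used directly as row labels.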
