regex - extract comma separated strings
I have the data frame below. The sample looks uniform, but the patterns across the whole data set are not uniform:
locationid  address
1073744023  525 east 68th street, new york, ny 10065, usa
1073744022  270 park avenue, new york, ny 10017, usa
1073744025  rockefeller center, 50 rockefeller plaza, new york, ny 10020, usa
1073744024  1251 avenue of americas, new york, ny 10020, usa
1073744021  1301 avenue of americas, new york, ny 10019, usa
1073744026  44 west 45th street, new york, ny 10036, usa
I need to find the city and country name from the address. I tried the following:
1) strsplit gives me a list, but I cannot access the last or third-last element of it.
2) Regular expressions. Finding the country is easy:
str_sub(str_extract(address, "\\d{5},\\s.*"),8,11)
but for the city
str_sub(str_extract(address, ",\\s.+,\\s.+\\d{5}"),3,comma_pos)
I cannot find comma_pos,
which leads me to the same problem again. I believe there is a more efficient way to solve this using one of the above approaches.
Split the data:
ss <- strsplit(data, ",")
Then
n <- sapply(ss, length)
will give the number of elements in each split (so you can work backward). Then
mapply("[[", ss, n)
gives the last element of each. Or do
sapply(ss,tail,1)
to get the last element.
To get the second-to-last element (or, more generally, the k-th from the end), you need
sapply(ss,function(x) tail(x,2)[1])
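Putting the pieces together for the question's data, here is a minimal sketch in base R. It assumes the city is always the third element from the end and the country the last, which holds for the sample rows but may not hold for the whole data set. The vector name address and the helper from_end are illustrative names, not part of the original code, and trimws is added because splitting on "," leaves a leading space on each piece. The last two lines show a regex alternative with sub under the same assumption.

```r
# Two sample addresses taken from the question
address <- c(
  "525 east 68th street, new york, ny 10065, usa",
  "rockefeller center, 50 rockefeller plaza, new york, ny 10020, usa"
)

# Split on commas, then trim the whitespace left around each piece
ss <- lapply(strsplit(address, ","), trimws)

# Hypothetical helper: k-th element counting from the end
# (if a vector has fewer than k pieces, this returns its first piece)
from_end <- function(x, k) tail(x, k)[1]

country <- sapply(ss, from_end, k = 1)  # last piece
city    <- sapply(ss, from_end, k = 3)  # third piece from the end

# Regex alternative: capture the field followed by exactly two more fields
city_re    <- sub(".*,\\s*([^,]+),[^,]*,[^,]*$", "\\1", address)
country_re <- sub(".*,\\s*", "", address)
```

Counting from the end is what makes this tolerate the extra leading field in the rockefeller center row. Rows with fewer than three comma-separated fields would still need separate handling.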