A fast way to merge named vectors of different lengths into a data frame (preserving the names as column names) in R
I have a list l of named vectors. For example, the first element:

    > l[[1]]
    $event
    [1] "eventa"

    $time
    [1] "1416355303"

    $city
    [1] "los angeles"

    $region
    [1] "california"

    $locale
    [1] "en-gb"
When I unlist each element of the list, the resulting vectors look like this (for the first three elements):

    > unlist(l[[1]])
            event          time          city        region        locale
         "eventa"  "1416355303" "los angeles"  "california"       "en-gb"
    > unlist(l[[2]])
           event         time       locale
        "eventb" "1416417567"      "en-gb"
    > unlist(l[[3]])
              event properties.time
           "eventm"    "1416417569"
I have on the order of 0.5 million elements in the list, and each one has up to 42 of these features/names. I have to merge them into a data frame, taking the names into account, and not all of them have the same number of features or names (in the example above, the second vector has no information about region or city). At the moment, I loop through the whole list:
    df1 <- merge(stack(unlist(l[[1]])), stack(unlist(l[[2]])), by = "ind", all = TRUE)
    suppressWarnings(for (i in 3:length(l)) {
      df1 <- merge(df1, stack(unlist(l[[i]])), by = "ind", all = TRUE)
    })
    df1 <- as.data.frame(t(df1))
For the example above, this returns:

             V1          V2     V3     V4         V5
    ind      city        event  locale region     time
    values.x los angeles eventa en-gb  california 1416355303
    values.y <NA>        eventb en-gb  <NA>       1416417567
    values   <NA>        eventm <NA>   <NA>       1416417569
This is what I want. However, given the length of the list, and the fact that every time the command

    df1 <- merge(df1, stack(unlist(l[[i]])), by = "ind", all = TRUE)

runs it has to process the entire data frame (df1) again, the loop takes a long time. I am therefore wondering whether anyone knows a better/faster way to code this. In other words: given a long list of named vectors of different lengths, is there a fast way to merge them into a data frame like the one described above?

For example, is there a way of doing this using foreach and %dopar%? In any case, any faster approach is welcome.
I'm not sure why you use merge. It seems to me that you should rbind instead.
    l <- list(list(event = "eventa", time = 1416355303, city = "los angeles",
                   region = "california", locale = "en-gb"),
              list(event = "eventb", time = 1416417567, locale = "en-gb"),
              list(event = "eventm", time = 1416417569))

    library(plyr)
    do.call(rbind.fill, lapply(l, as.data.frame))
    #   event       time        city     region locale
    #1 eventa 1416355303 los angeles california  en-gb
    #2 eventb 1416417567        <NA>       <NA>  en-gb
    #3 eventm 1416417569        <NA>       <NA>   <NA>
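For a list this long, converting every element to its own data frame may itself become a noticeable cost. As a rough sketch of the same "bind rows by name, fill the gaps with NA" idea in base R only, the version below collects the union of all names once and fills a character matrix in a single pass. The helper name bind_by_name is hypothetical (it is not from the question or the answer), and the sketch assumes that storing every column as character is acceptable.

```r
# Example input, as in the answer above: named lists with differing fields.
l <- list(
  list(event = "eventa", time = 1416355303, city = "los angeles",
       region = "california", locale = "en-gb"),
  list(event = "eventb", time = 1416417567, locale = "en-gb"),
  list(event = "eventm", time = 1416417569)
)

# Hypothetical helper (an illustration, not part of any package):
# bind a list of named vectors/lists into one data frame, matching by name.
bind_by_name <- function(l) {
  # Flatten each element to a named atomic vector.
  vecs <- lapply(l, unlist)
  # Union of all names, in order of first appearance.
  cols <- unique(unlist(lapply(vecs, names)))
  # For each element, pick its values in column order; a missing name gives NA.
  vals <- lapply(vecs, function(v) as.character(v)[match(cols, names(v))])
  # Assemble everything at once instead of growing a data frame in a loop.
  m <- matrix(unlist(vals), ncol = length(cols), byrow = TRUE,
              dimnames = list(NULL, cols))
  as.data.frame(m, stringsAsFactors = FALSE)
}

df <- bind_by_name(l)
print(df)
```

All columns come back as character, so a numeric field would need converting afterwards, for example df$time <- as.numeric(df$time). If an extra dependency is acceptable, data.table::rbindlist(l, fill = TRUE) is another option that fills missing columns with NA; whether either is fast enough for 0.5 million elements would need to be measured on the real data.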