Monday, 26 August 2013

Using R to reformat data from cross-tab to one-datum-per-line format

Using R to reformat data from cross-tab to one-datum-per-line format

I'm using R to pull in data through an API and merge all of it into a
single table, which I then write to a CSV file. To graph it properly in
Tableau, however, I need to prepare the data by using their reformatting
tool for Excel to get it from a cross-tablulated format to a format where
each line contains only once piece of data. For example, taking something
from the format:
ID,Gender,School,Math,English,Science
1,M,West,90,80,70
2,F,South,50,50,50
To:
ID,Gender,School,Subject,Score
1,M,West,Math,90
1,M,West,English,80
1,M,West,Science,70
2,F,South,Math,50
2,F,South,English,50
2,F,South,Science,50
Are there any existing tools in R or in an R library that would allow me
to do this, or that would provide a starting point? I am trying to
automate the preparation of data for Tableau so that I just need to run a
single script to get it formatted properly, and would like to remove the
manual Excel step if possible.

No comments:

Post a Comment