Friday, December 28, 2012

File data handling - No7

In the last few posts, I spent most of the time explaining how to generate data and how to manipulate it with different structures such as matrices, data frames, and vectors.
In reality, however, we usually get raw data from a database or from files.

In this post, I will show you one of the simplest ways to handle a raw data file.
I will cover database connections later.
I think most people are MS Excel users.
You probably also have various raw data coming from diverse sources.
From that point of view, collecting meaningful raw data may be the starting point of your data analysis.

In this lesson, I will assume that we have meaningful Excel data that requires further analysis.




[ The Excel data looks like this ]

Tree age circumference
1 118 30
1 484 58
1 664 87
1 1004 115
1 1231 120
1 1372 142
1 1582 145
2 118 33
2 484 69
2 664 111
2 1004 156
2 1231 172
2 1372 203
2 1582 203
3 118 30
3 484 51
3 664 75
3 1004 108
3 1231 115

You can save your Excel data as a CSV file on your local computer.
Then you can read the CSV file in the R console and assign the data to a variable.

> OrangeTree <- read.csv("D:/R/Download/OrangeTree.csv" )
> OrangeTree
   Tree  age circumference
1     1  118            30
2     1  484            58
3     1  664            87
4     1 1004           115
5     1 1231           120
6     1 1372           142
7     1 1582           145
8     2  118            33
9     2  484            69
10    2  664           111
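* This assumes the CSV file is located at "D:\R\Download\OrangeTree.csv". Inside an R string, write the path with forward slashes (or doubled backslashes), as in the command above.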
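
If you do not have the file at that location, the same step can be tried with a temporary file. The following is only a minimal sketch: the temporary path from tempfile() and the handful of rows written to it are assumptions made so that the example runs anywhere, not part of the original workflow.

```r
# Write a few of the rows shown above to a temporary CSV file,
# so the example does not depend on D:/R/Download existing.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Tree,age,circumference",
  "1,118,30",
  "1,484,58",
  "1,664,87",
  "2,118,33",
  "2,484,69"
), csv_path)

# read.csv() treats the first line as column names by default (header = TRUE).
OrangeTree <- read.csv(csv_path)

class(OrangeTree)  # the class of the object returned by read.csv()
str(OrangeTree)    # column names and column types
nrow(OrangeTree)   # number of data rows (the header line is not counted)
```

str() is a quick way to check that each column was read with the type you expected before going on to the analysis.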

As the output suggests, read.csv returns a data frame.
You can also select rows with a condition.

> OrangeTree[OrangeTree$age > 1500, ]
   Tree  age circumference
7     1 1582           145
14    2 1582           203
21    3 1582           140
28    4 1582           214
35    5 1582           177
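
The row filter can be written in more than one way. Here is a small sketch, assuming a data frame built by hand from the Tree 1 and Tree 2 rows listed earlier, so that it runs without the CSV file. Note that a column such as age lives inside the data frame, so it has to be referred to as OrangeTree$age unless a helper like subset() looks it up for you.

```r
# Rebuild the Tree 1 and Tree 2 rows from the listing above (14 rows).
OrangeTree <- data.frame(
  Tree = rep(1:2, each = 7),
  age = rep(c(118, 484, 664, 1004, 1231, 1372, 1582), times = 2),
  circumference = c(30, 58, 87, 115, 120, 142, 145,
                    33, 69, 111, 156, 172, 203, 203)
)

# Logical indexing: keep the rows where age is above 1500, and all columns.
old_trees <- OrangeTree[OrangeTree$age > 1500, ]

# subset() evaluates the condition inside the data frame,
# so the column can be named directly.
old_trees2 <- subset(OrangeTree, age > 1500)

# Conditions can be combined with & (and) or | (or).
big_tree2 <- OrangeTree[OrangeTree$Tree == 2 & OrangeTree$circumference > 150, ]

old_trees
big_tree2
```

Both forms keep the original row numbers, which is why the filtered output above shows rows 7, 14, 21 and so on instead of 1, 2, 3.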


Try it with your own Excel data.
