Reshaping data in R

Reshaping data is an important step in data analysis and data transformation. Today we’ll look at a few of the base R functions for it: rbind(), cbind() and merge(). The tutorial is easy to follow and uses the mtcars dataset, which ships with R.

rbind()

rbind() stands for row bind. It stacks data frames on top of one another, appending the rows of the second below the rows of the first. We’ll see how it works with a few examples.

First, a quick look at the data.

 

> head(mtcars)
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

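Here we create three subsets: data1 and data2 hold rows 1 to 10 and 11 to 20 with all columns, while data3 holds rows 1 to 10 without the 11th column (carb).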

data1 <- mtcars[1:10,]
data2 <- mtcars[11:20,]
data3 <- mtcars[1:10,-11]

We’ll check the dimensions of the created data frames, that is, the number of rows and the number of columns.

> dim(data1)
[1] 10 11
> dim(data2)
[1] 10 11
> dim(data3)
[1] 10 10

Both data frames must have the same columns. data1 and data2 do, so rbind() works:

> Rowbind <- rbind(data1, data2)
> dim(Rowbind)
[1] 20 11

Rowbind1 throws an error because the columns are not the same in both data frames: data3 is missing carb.

> Rowbind1 <- rbind(data2, data3)
Error in rbind(deparse.level, ...) : numbers of columns of arguments do not match
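
When the column sets differ, one common workaround is to bind only the columns that both data frames share. The sketch below assumes that dropping the unmatched column (carb here) is acceptable for your analysis; the name common is only illustrative.

```r
# Recreate the subsets used above
data2 <- mtcars[11:20, ]
data3 <- mtcars[1:10, -11]   # no carb column

# Keep only the columns present in both data frames
common <- intersect(names(data2), names(data3))

# rbind() now succeeds because both inputs have the same 10 columns
Rowbind1 <- rbind(data2[, common], data3[, common])
dim(Rowbind1)
```

For data frames, rbind() matches columns by name rather than by position, so the column order does not have to be identical.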

 

cbind()

cbind() stands for column bind. It places data frames side by side, appending the columns of the second to the right of the first. We’ll see how it works with a few examples.

> data4 <- mtcars[11:15,]
> ColBind <- cbind(data1,data2)
> dim(ColBind)
[1] 10 22
> ColBind1 <- cbind(data1,data4)
> dim(ColBind1)
[1] 10 22
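
Note that ColBind contains every column name twice, because cbind() does not rename anything. Here is a minimal sketch of detecting and fixing this with base R's make.unique(); whether you prefer suffixed names or some other renaming scheme is up to you.

```r
data1 <- mtcars[1:10, ]
data2 <- mtcars[11:20, ]
ColBind <- cbind(data1, data2)

# Each of the 11 column names appears twice
anyDuplicated(names(ColBind)) > 0

# Selecting by a duplicated name returns only the first match (from data1)
identical(ColBind$mpg, data1$mpg)

# Make the names unique: the second copies become mpg.1, cyl.1, ...
names(ColBind) <- make.unique(names(ColBind))
head(names(ColBind), 3)
tail(names(ColBind), 3)
```

Keep in mind that cbind() pastes rows together by position. It does not check that row 1 of one data frame describes the same car as row 1 of the other.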

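If the numbers of rows are not the same, cbind() recycles the shorter input, repeating its rows until it matches the longer one. That is why ColBind1 above still has 10 rows even though data4 has only 5. For data frames this only works when the larger row count is an exact multiple of the smaller one. The same recycling happens with plain vectors, as you can see below.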

> cbind(1:2, 1:10)
      [,1] [,2]
 [1,]    1    1
 [2,]    2    2
 [3,]    1    3
 [4,]    2    4
 [5,]    1    5
 [6,]    2    6
 [7,]    1    7
 [8,]    2    8
 [9,]    1    9
[10,]    2   10
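
Recycling is only silent when the longer length is an exact multiple of the shorter one. The sketch below shows what happens otherwise: plain vectors are still recycled but with a warning, while data frames refuse to combine. The variable names m and result are illustrative.

```r
# Vectors: 10 is not a multiple of 3, so R recycles and raises a warning
m <- withCallingHandlers(
  cbind(1:3, 1:10),
  warning = function(w) {
    message("Warning caught: ", conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)
dim(m)

# Data frames: 10 rows and 3 rows cannot be recycled evenly, so cbind() stops
result <- tryCatch(
  cbind(mtcars[1:10, ], mtcars[11:13, ]),
  error = function(e) conditionMessage(e)
)
result
```

If you rely on recycling, it is worth checking nrow() on both inputs first so that a mismatch does not slip through unnoticed.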

 

merge()

merge() joins two data frames by common columns or row names, and can perform the other kinds of database join operations.

Let’s create two sample subsets of the mtcars dataset.

data5 <- mtcars[sample(1:nrow(mtcars),10),]
data6 <- mtcars[sample(1:nrow(mtcars),10),]

If the column to join on has the same name in both data frames, pass it with by:

merge(data5,data6,by = "mpg" )
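
merge() can also join on row names, which for mtcars are the car names. As a sketch, this is done with by = "row.names". The overlapping row ranges and column choices below are assumptions picked so that the result is predictable, unlike the random samples above.

```r
# Two slices of mtcars that share cars 6 to 10 but hold different columns
first  <- mtcars[1:10, c("mpg", "cyl")]
second <- mtcars[6:15, c("hp", "wt")]

# Join on row names; only cars present in both slices are kept
by_name <- merge(first, second, by = "row.names")

dim(by_name)
names(by_name)
```

The row names end up in a regular column called Row.names in the merged result.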

If the column names are not the same, name them separately with by.x and by.y (in this example both happen to be mpg):

merge(data5,data6,by.x = "mpg",by.y="mpg" )
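
Since merge() can also do the other database-style joins, here is a small sketch that combines by.x/by.y on differently named key columns with the all.x and all arguments. The two toy data frames and their column names (model, car, mpg, hp) are made up for illustration.

```r
# Toy data: the key column is called model in one frame and car in the other
specs <- data.frame(model = c("A", "B", "C"), mpg = c(21.0, 22.8, 18.7))
power <- data.frame(car   = c("B", "C", "D"), hp  = c(93, 175, 110))

# Inner join (default): only keys found in both frames
inner_join <- merge(specs, power, by.x = "model", by.y = "car")

# Left join: keep every row of specs, fill missing hp with NA
left_join <- merge(specs, power, by.x = "model", by.y = "car", all.x = TRUE)

# Full outer join: keep every key from either frame
full_join <- merge(specs, power, by.x = "model", by.y = "car", all = TRUE)

nrow(inner_join)
nrow(left_join)
nrow(full_join)
left_join
```

The key column in the result takes its name from by.x, so it is called model in all three outputs.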

 

 


Keep visiting Analytics Tuts for more tutorials.

Thanks for reading! Comment your suggestions and queries.

 

 

 
