본문 바로가기
파이썬/파이썬 pandas

[pandas] 행과 열의 형태 변형하기 melt, pivot, transpose

by Merware 2023. 5. 14.

데이터 준비하기

import pandas as pd
df = pd.read_csv('data/scores.csv')
df = df.head(2)
df

"""
	name	kor	eng	math
0	Aiden	100.0	90.0	95.0
1	Charles	90.0	80.0	75.0

 

모든 열 melt

  • df.melt( )
  • pd.melt(df)
df.melt()

"""
	variable	value
0	name	Aiden
1	name	Charles
2	kor	100.0
3	kor	90.0
4	eng	90.0
5	eng	80.0
6	math	95.0
7	math	75.0

 

고정할 컬럼 지정하여 melt

  • id_vars=[열이름리스트] --> 위치를 그대로 유지할 열 이름
df

"""
	name	kor	eng	math
0	Aiden	100.0	90.0	95.0
1	Charles	90.0	80.0	75.0
"""

# name 고정
df.melt(id_vars='name')
"""
	name	variable	value
0	Aiden	kor	100.0
1	Charles	kor	90.0
2	Aiden	eng	90.0
3	Charles	eng	80.0
4	Aiden	math	95.0
5	Charles	math	75.0
"""

# name, kor 고정
df.melt(id_vars=['name','kor'])
"""
	name	kor	variable	value
0	Aiden	100.0	eng	90.0
1	Charles	90.0	eng	80.0
2	Aiden	100.0	math	95.0
3	Charles	90.0	math	75.0

 

행으로 위치를 변경할 열 지정

  • value_vars=[열이름리스트]
# kor
df.melt(id_vars='name', value_vars='kor')
"""
	name	variable	value
0	Aiden	kor	100.0
1	Charles	kor	90.0


# kor, eng
df.melt(id_vars='name', value_vars=['kor','eng'])
"""
	name	variable	value
0	Aiden	kor	100.0
1	Charles	kor	90.0
2	Aiden	eng	90.0
3	Charles	eng	80.0

 

컬럼명 변경하기

  • var_name=컬럼명 --> value_vars로 위치를 변경한 열 이름
  • value_name=var_name으로 위치를 변경한 열의 데이터를 저장한 열 이름
# subject, score
df.melt(id_vars='name', value_vars=['kor','eng'], var_name='subject', value_name='score')

"""
	name	subject	score
0	Aiden	kor	100.0
1	Charles	kor	90.0
2	Aiden	eng	90.0
3	Charles	eng	80.0

 

열을 행으로 보내기(pivot)

# 샘플데이터
df = pd.read_csv('data/scores.csv')
df = df.head(2)
df = df.melt(id_vars = 'name', var_name='subject',value_name='score')

def get_grade(x):
    if x>=90: grade='A'
    elif x>=80: grade='B'
    elif x>=70: grade='C'
    elif x>=60: grade='D'
    else: grade='F'
    return grade

df['grade'] = df['score'].apply(get_grade)
df = df.sort_values('name')
df


"""
	name	subject	score	grade
0	Aiden	kor	100.0	A
2	Aiden	eng	90.0	A
4	Aiden	math	95.0	A
1	Charles	kor	90.0	A
3	Charles	eng	80.0	B
5	Charles	math	75.0	C

 

  • 데이터프레임.pivot( index=인덱스로 사용할 컬럼, columns=컬럼으로 사용할 컬럼, values=값으로 사용할 컬럼 )
# name, subject, score
df.pivot(index='name', columns='subject',values='score')

"""
subject	eng	kor	math
name			
Aiden	90.0	100.0	95.0
Charles	80.0	90.0	75.0
"""


# name, subject, grade
df.pivot(index='name', columns='subject',values='grade')
"""
subject	eng	kor	math
name			
Aiden	A	A	A
Charles	B	A	C
"""


# name, subject, [grade,score]
df.pivot(index='name', columns='subject',values=['score','grade'])
"""
			score		grade
subject	eng	kor	math	eng	kor	math
name						
Aiden	90.0	100.0	95.0	A	A	A
Charles	80.0	90.0	75.0	B	A	C
"""


# name, subject
df.pivot(index='name', columns='subject')
"""
			score		grade
subject	eng	kor	math	eng	kor	math
name						
Aiden	90.0	100.0	95.0	A	A	A
Charles	80.0	90.0	75.0	B	A	C

 

행과 열 바꾸기

  • 데이터프레임.transpose()
df.head()

"""
	name	subject	score	grade
0	Aiden	kor	100.0	A
2	Aiden	eng	90.0	A
4	Aiden	math	95.0	A
1	Charles	kor	90.0	A
3	Charles	eng	80.0	B
"""


df.transpose()
"""
	0	2	4	1	3	5
name	Aiden	Aiden	Aiden	Charles	Charles	Charles
subject	kor	eng	math	kor	eng	math
score	100.0	90.0	95.0	90.0	80.0	75.0
grade	A	A	A	A	B	C