Reputation: 527
I am analyzing DNA/protein sequence data with Python and have run into a problem. Here is the table of DNA sequences.
I want to analyze them with group1 and group2 treated as pairs. For example, AAATTT_TTTCCC and GGGCCC_GGAAA are pairs.
The same sequence sometimes appears more than once in the data. For instance, AAATTT appears three times and AGTC appears twice. I want to count these repeated sequences and summarize them as below. I think I should use pandas, but I don't know how to do this. I would be very grateful for any help.
Upvotes: 2
Views: 363
Reputation: 2945
To count the number of appearances of each unique value in a column:
# import pandas
import pandas as pd
# load data into Pandas dataframe
df = pd.read_csv("data.csv")
# get counts for each unique Group1 value
print(df["Group1"].value_counts())
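Since the question also asks about treating the two groups as pairs, the same idea can probably be extended by grouping on both columns. This is only a rough sketch: the sample rows are made up for illustration, and the column name "Group2" is an assumption based on the question's wording, so adjust both to match the real table.

```python
import pandas as pd

# Hypothetical sample data; the real table would come from pd.read_csv(...).
# The column names "Group1" and "Group2" are assumptions.
df = pd.DataFrame({
    "Group1": ["AAATTT", "AAATTT", "GGGCCC", "AAATTT", "AGTC", "AGTC"],
    "Group2": ["TTTCCC", "TTTCCC", "GGAAA", "CCC", "GGG", "GGG"],
})

# Count how many times each (Group1, Group2) pair occurs,
# then turn the result back into a regular table with a "count" column.
pair_counts = (
    df.groupby(["Group1", "Group2"])
      .size()
      .reset_index(name="count")
      .sort_values("count", ascending=False)
)
print(pair_counts)
```

Each row of pair_counts is one unique pair with the number of times it appears. If a single underscore-joined label such as AAATTT_TTTCCC is preferred, the two columns can be concatenated with df["Group1"] + "_" + df["Group2"] and counted with value_counts() instead.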
Upvotes: 1