Command

Splitting fasta

$ faidx -x reference.fa

chr1.fa chr2.fa ... chrY.fa

$ samtools faidx reference.fa chr1 > reference_chr1.fa

Indexing fasta

$ samtools faidx chr21.fa

$ cat chr21.fa.fai

chr21 46708999 7 60 61

NAME	Name of this reference sequence
LENGTH	Total length of this reference sequence, in bases
OFFSET	Offset in the FASTA/FASTQ file of this sequence's first base
LINEBASES	The number of bases on each line
LINEWIDTH	The number of bytes in each line, including the newline
QUALOFFSET	Offset of sequence's first quality within the FASTQ file
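
As a rough illustration of how these columns fit together, the sketch below parses the chr21 line shown above and computes where a given base sits in the FASTA file. The names `FaiRecord`, `parse_fai_line`, and `byte_offset` are made up for this note and are not part of samtools.

```python
from typing import NamedTuple


class FaiRecord(NamedTuple):
    name: str       # NAME
    length: int     # LENGTH, in bases
    offset: int     # OFFSET of the first base, in bytes
    linebases: int  # LINEBASES
    linewidth: int  # LINEWIDTH, in bytes, including the newline


def parse_fai_line(line: str) -> FaiRecord:
    """Parse the first five whitespace-separated columns of a .fai line."""
    name, length, offset, linebases, linewidth = line.split()[:5]
    return FaiRecord(name, int(length), int(offset), int(linebases), int(linewidth))


def byte_offset(rec: FaiRecord, pos: int) -> int:
    """Return the byte offset in the FASTA file of the 1-based base `pos`."""
    if not 1 <= pos <= rec.length:
        raise ValueError("position outside the sequence")
    # Count the full lines before the base, then the column within its line.
    full_lines, column = divmod(pos - 1, rec.linebases)
    return rec.offset + full_lines * rec.linewidth + column


rec = parse_fai_line("chr21 46708999 7 60 61")
print(byte_offset(rec, 1))   # 7: the first base sits right after the ">chr21" header line
print(byte_offset(rec, 61))  # 68: first base of the second sequence line
```

This is the arithmetic that lets an indexed FASTA be read at a random position without scanning the whole file.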

$ alias prl="bash -c '(for i in {1..22};do eval echo \$@ ;done) |parallel \"{}\" ' _"

$ prl 'bcftools view -r ${i} data.vcf.gz -Oz -o data_chr${i}.vcf.gz'

Sort

$ (grep ^"#" data.vcf; grep -v ^"#" data.vcf | sort -k1,1V -k2,2g) > sorted.vcf
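
For reference, the same idea (keep the header lines on top, then order records by chromosome and position) can be sketched in Python. The helpers `natural_key` and `sort_vcf_lines` are hypothetical names for this note, and the natural-order key only approximates what `sort -V` does for typical chromosome names such as chr1 ... chr22, chrX, chrY.

```python
import re
from typing import List, Tuple


def natural_key(text: str) -> Tuple:
    """Split text into digit and non-digit runs so that chr2 sorts before chr10."""
    return tuple(
        (0, int(tok)) if tok.isdigit() else (1, tok)
        for tok in re.findall(r"\d+|\D+", text)
    )


def sort_vcf_lines(lines: List[str]) -> List[str]:
    """Keep '#' header lines first, then sort records by CHROM and POS."""
    header = [ln for ln in lines if ln.startswith("#")]
    body = [ln for ln in lines if not ln.startswith("#")]

    def key(line: str):
        chrom, pos = line.split("\t", 2)[:2]
        return natural_key(chrom), int(pos)

    return header + sorted(body, key=key)


records = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID",
    "chr10\t5\t.",
    "chr2\t100\t.",
    "chrX\t1\t.",
    "chr2\t20\t.",
]
for line in sort_vcf_lines(records):
    print(line)
```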

$ find path -type f -name "*.txt" -print0 | sort -zV | xargs -0 cat | sort -g -k 9 -o outfilename

Grouping

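# df: a pandas DataFrame with at least the columns GENE and SNP (one row per SNP)
# Collect the SNPs of each gene into one space-separated string
df1=df.groupby('GENE')['SNP'].apply(' '.join).reset_index()

# Spread the SNPs into one column each; pad shorter rows with empty strings
df2=df1['GENE'].to_frame()
df3=df1['SNP'].str.split(' ',expand=True).fillna('')
df4=df2.join(df3,how='right')
# Write a headerless, tab-separated group file: gene first, then its SNPs
df4.to_csv('groupFile_geneBasedtest.txt',header=False,index=False,sep='\t')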

https://stackoverflow.com/questions/36271413/pandas-merge-nearly-duplicate-rows-based-on-column-value

Space-separate to Tab-separate

awk -v OFS="\t" '{$1=$1}1' data.txt > data_tab.txt

-v OFS="\t" # set the output field separator to a tab
{$1=$1}     # reassign a field so awk rebuilds the line with OFS, collapsing extra spaces
1           # an always-true pattern, so awk prints the rebuilt line

https://stackoverflow.com/questions/59472326/what-does-this-mean-awk-ofs-t-1-11-filepath
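
To make the effect of the awk one-liner concrete, here is a minimal Python sketch of the same transformation, assuming the default awk field splitting on runs of spaces and tabs. The function name `spaces_to_tabs` is made up for this note.

```python
def spaces_to_tabs(line: str) -> str:
    """Split on runs of whitespace and rejoin the fields with single tabs."""
    return "\t".join(line.split())


print(repr(spaces_to_tabs("  rs123   chr1  12345 ")))
```

Leading and trailing blanks disappear and every run of spaces becomes one tab, which is what rebuilding the record with OFS does in awk.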

Insert 0

num=2

a=format(num, '03')
b='{0:04d}'.format(num)

a='002'
b='0002'
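
Zero padding can be written in a few equivalent ways in Python. The small sketch below puts them side by side; the widths 5 and 6 are arbitrary choices for illustration.

```python
num = 2

a = format(num, "03")      # built-in format(): width 3, zero-padded
b = "{0:04d}".format(num)  # str.format(): width 4, zero-padded
c = f"{num:05d}"           # f-string with the same kind of format spec, width 5
d = str(num).zfill(6)      # str.zfill(): pad the string form to width 6

print(a, b, c, d)  # 002 0002 00002 000002
```

All four results are strings, so they are suited to building file names or IDs, not to arithmetic.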

Split a file

$ split -d -n r/4 data.txt data_ # distribute the lines round-robin into 4 files with numeric suffixes
data_00 data_01 data_02 data_03
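
Round-robin here means that line 1 goes to the first file, line 2 to the second, and so on, wrapping around after the fourth. A minimal Python sketch of that distribution follows; `round_robin_split` is a hypothetical helper, not part of coreutils.

```python
from typing import List


def round_robin_split(lines: List[str], n: int) -> List[List[str]]:
    """Deal lines out to n chunks in turn, like dealing cards."""
    chunks: List[List[str]] = [[] for _ in range(n)]
    for i, line in enumerate(lines):
        chunks[i % n].append(line)
    return chunks


lines = [f"line{i}" for i in range(1, 11)]
chunks = round_robin_split(lines, 4)
for idx, chunk in enumerate(chunks):
    # Mimic the data_00 ... data_03 naming produced by split -d
    print(f"data_{idx:02d}", chunk)
```

Unlike a split into contiguous blocks, each output file gets an interleaved sample of the input, so chunk sizes differ by at most one line.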

$ vi ~/.config/matplotlib/matplotlibrc
backend: TkAgg

$ rpm -ql libxml2-devel

in R
Sys.setenv(XML_CONFIG="/usr/bin/xml2-config")
