BBJ data directory: /gpfs/data/im-lab/nas40t2/Data/BBJ

I first downloaded and decrypted the Biobank Japan data (instructions), then organized the files, in their original form, into the subdirectories BBJ-genotypes-decrypted and BBJ-phenotypes-decrypted.

Phenotypes

BBJ phenotypes file: /gpfs/data/im-lab/nas40t2/Data/BBJ/BBJ-phenotypes.csv

This CSV combines all phenotype data from the BBJ-phenotypes-decrypted subdirectory into one file. The original BBJ phenotype data in BBJ-phenotypes-decrypted was bulky and labeled by dataset ID rather than by phenotype name. The file BBJ-phenotype-list.txt lists every phenotype alongside its folder name (download).

I created the combined phenotype file with the following script:

python3 process-phenotypes.py --BBJ_folder /Users/sabrinami/BBJ/BBJ-phenotypes \
--phenotype_mapping /Users/sabrinami/Github/analysis-sabrina/BBJ-data-processing/BBJ-phenotype-list.txt \
--output /Users/sabrinami/Github/analysis-sabrina/BBJ-data-processing/BBJ-phenotypes.csv
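
The source of process-phenotypes.py is not reproduced here, so the following is only a rough sketch of how such a merge could work, not the script itself. It assumes, purely for illustration, that BBJ-phenotype-list.txt holds tab-separated `phenotype name, folder name` pairs and that each dataset folder contains a file named `phenotype.csv` with columns `ID` and `value`. The file name, the column names, and both function names are hypothetical.

```python
import csv
from pathlib import Path


def read_mapping(mapping_path):
    """Read assumed tab-separated lines: phenotype_name<TAB>folder_name."""
    mapping = {}
    with open(mapping_path, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if len(row) >= 2:
                # Key by folder (dataset ID) so folders can be renamed to phenotypes.
                mapping[row[1].strip()] = row[0].strip()
    return mapping


def combine_phenotypes(bbj_folder, mapping, output_path):
    """Merge per-dataset tables into one wide CSV with one row per sample ID."""
    combined = {}  # sample ID -> {phenotype name: value}
    for folder_name, phenotype in sorted(mapping.items()):
        # Hypothetical layout: <bbj_folder>/<folder_name>/phenotype.csv
        table = Path(bbj_folder) / folder_name / "phenotype.csv"
        if not table.exists():
            continue  # skip datasets that were not downloaded
        with open(table, newline="") as handle:
            for row in csv.DictReader(handle):
                combined.setdefault(row["ID"], {})[phenotype] = row["value"]

    phenotypes = sorted(set(mapping.values()))
    with open(output_path, "w", newline="") as handle:
        # restval fills in "NA" where a sample lacks a given phenotype.
        writer = csv.DictWriter(handle, fieldnames=["ID"] + phenotypes, restval="NA")
        writer.writeheader()
        for sample_id in sorted(combined):
            writer.writerow({"ID": sample_id, **combined[sample_id]})
    return phenotypes
```

Under those assumptions, calling `combine_phenotypes(folder, read_mapping(list_file), "BBJ-phenotypes.csv")` would write one column per phenotype name in place of the dataset IDs, with `NA` marking samples that lack a measurement.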

Genotypes

BBJ genotypes folder: /gpfs/data/im-lab/nas40t2/Data/BBJ/BBJ-genotypes-decrypted

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The source code is licensed under MIT.

Citation

For attribution, please cite this work as:

Sabrina Mi (2022). Biobank Japan Data in CRI. ImLab Notes. /post/2022/01/02/biobank-japan-data-in-cri/

BibTeX citation

@misc{
  title = "Biobank Japan Data in CRI",
  author = "Sabrina Mi",
  year = "2022",
  journal = "ImLab Notes",
  note = "/post/2022/01/02/biobank-japan-data-in-cri/"
}