LAAC-LSCP/zoo-babble-validation

Describing vocalizations in young children: A big data approach through citizen science annotation

This repo contains the code needed to reproduce a conference paper (202007_SLT) and a journal paper (202010_jslhr). Because the journal paper expands on the data and analyses of the conference paper, we only provide instructions for reproducing the journal paper.

To reproduce the manuscript, knit 202010_jslhr/paper.Rmd. Note: if you are not on the LAAC team, this is probably the only step you can reproduce. If you want to reproduce the other steps, write to alecristia@gmail.com.

To rerun the whole pipeline:

Put the following two files inside files_from_zooniverse/:

maturity-of-baby-sounds-subjects.csv contains information about all the clips that have been put up on Zooniverse
maturity-of-baby-sounds-classifications.csv contains information about all the classifications that have been done on Zooniverse

These are very large, so they are not synced in the repo.

Run data_analyses/code/generate_jslhr_data.R. This script filters the subject and classification data down to the filenames in the PU dataset. It also generates key_info.csv, which is used in paper.Rmd, as well as the following files, which are used in preprocess.R:

zoo_subj_info_pu.csv contains information about the PU clips that have been put up on Zooniverse
zoo_anno_info_pu.csv contains information about the classifications that have been done on Zooniverse for the PU clips specifically

Run 202010_jslhr/postprocess.R, which has three steps: cleaning errors in the classifications, generating chunk-level data, and generating segment-level data. Note that this script generates data_analyses/output/clean_classifications.csv, another large file that is not synced but is necessary for this process. It also generates the two key files used in paper.Rmd:

zoo_lab_maj_judgments.csv
chunks_maj_judgments.csv

Knit 202010_jslhr/paper.Rmd.
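
The steps above can be sketched as a small driver script. This is a minimal, hypothetical helper and not part of the repo (which is written in R): the file names and script paths are taken from the steps above, while the function names, the use of Rscript from the repo root, and knitting through rmarkdown::render are assumptions. The location of files_from_zooniverse/ is passed in by the caller, since this README does not state where that folder lives.

```python
import subprocess
from pathlib import Path

# File names quoted from the README; the pre-flight check itself is hypothetical.
ZOONIVERSE_EXPORTS = (
    "maturity-of-baby-sounds-subjects.csv",
    "maturity-of-baby-sounds-classifications.csv",
)


def missing_exports(zooniverse_dir):
    """Return the expected Zooniverse export names not found in zooniverse_dir."""
    folder = Path(zooniverse_dir)
    return [name for name in ZOONIVERSE_EXPORTS if not (folder / name).is_file()]


def pipeline_commands():
    """Return the pipeline steps, in README order, as argument lists."""
    return [
        # Assumption: the scripts can be launched with Rscript from the repo root.
        ["Rscript", "data_analyses/code/generate_jslhr_data.R"],
        ["Rscript", "202010_jslhr/postprocess.R"],
        # Assumption: knitting from the command line via rmarkdown::render;
        # the README only says to knit the file.
        ["Rscript", "-e", "rmarkdown::render('202010_jslhr/paper.Rmd')"],
    ]


def run_pipeline(zooniverse_dir):
    """Check the inputs, then run each step and stop at the first failure."""
    missing = missing_exports(zooniverse_dir)
    if missing:
        raise FileNotFoundError("Missing Zooniverse exports: " + ", ".join(missing))
    for command in pipeline_commands():
        subprocess.run(command, check=True)
```

Calling run_pipeline with the path to files_from_zooniverse/ would fail early with a list of the missing exports, before any R script starts. If only the manuscript needs rebuilding, the last entry of pipeline_commands() is the only step required.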
