diff --git a/.Rbuildignore b/.Rbuildignore index 4c43c07..5a4c8b8 100644 --- a/.Rbuildignore +++ b/.Rbuildignore @@ -4,6 +4,6 @@ ^_pkgdown\.yml$ ^pkgdown$ ^docs$ -^CONDUCT\.md$ +^CODE_OF_CONDUCT\.md$ ^CONTRIBUTING\.md$ ^\.github$ diff --git a/CODE_OF_CONDUCT.md b/CODE_OF_CONDUCT.md new file mode 100644 index 0000000..5b061ad --- /dev/null +++ b/CODE_OF_CONDUCT.md @@ -0,0 +1,126 @@ +# Contributor Covenant Code of Conduct + +## Our Pledge + +We as members, contributors, and leaders pledge to make participation in our +community a harassment-free experience for everyone, regardless of age, body +size, visible or invisible disability, ethnicity, sex characteristics, gender +identity and expression, level of experience, education, socio-economic status, +nationality, personal appearance, race, caste, color, religion, or sexual +identity and orientation. + +We pledge to act and interact in ways that contribute to an open, welcoming, +diverse, inclusive, and healthy community. + +## Our Standards + +Examples of behavior that contributes to a positive environment for our +community include: + +* Demonstrating empathy and kindness toward other people +* Being respectful of differing opinions, viewpoints, and experiences +* Giving and gracefully accepting constructive feedback +* Accepting responsibility and apologizing to those affected by our mistakes, + and learning from the experience +* Focusing on what is best not just for us as individuals, but for the overall + community + +Examples of unacceptable behavior include: + +* The use of sexualized language or imagery, and sexual attention or advances of + any kind +* Trolling, insulting or derogatory comments, and personal or political attacks +* Public or private harassment +* Publishing others' private information, such as a physical or email address, + without their explicit permission +* Other conduct which could reasonably be considered inappropriate in a + professional setting + +## Enforcement Responsibilities + +Community leaders are responsible for clarifying and enforcing our standards of +acceptable behavior and will take appropriate and fair corrective action in +response to any behavior that they deem inappropriate, threatening, offensive, +or harmful. + +Community leaders have the right and responsibility to remove, edit, or reject +comments, commits, code, wiki edits, issues, and other contributions that are +not aligned to this Code of Conduct, and will communicate reasons for moderation +decisions when appropriate. + +## Scope + +This Code of Conduct applies within all community spaces, and also applies when +an individual is officially representing the community in public spaces. +Examples of representing our community include using an official e-mail address, +posting via an official social media account, or acting as an appointed +representative at an online or offline event. + +## Enforcement + +Instances of abusive, harassing, or otherwise unacceptable behavior may be +reported to the community leaders responsible for enforcement at me@nanx.me. +All complaints will be reviewed and investigated promptly and fairly. + +All community leaders are obligated to respect the privacy and security of the +reporter of any incident. + +## Enforcement Guidelines + +Community leaders will follow these Community Impact Guidelines in determining +the consequences for any action they deem in violation of this Code of Conduct: + +### 1. Correction + +**Community Impact**: Use of inappropriate language or other behavior deemed +unprofessional or unwelcome in the community. + +**Consequence**: A private, written warning from community leaders, providing +clarity around the nature of the violation and an explanation of why the +behavior was inappropriate. A public apology may be requested. + +### 2. Warning + +**Community Impact**: A violation through a single incident or series of +actions. + +**Consequence**: A warning with consequences for continued behavior. No +interaction with the people involved, including unsolicited interaction with +those enforcing the Code of Conduct, for a specified period of time. This +includes avoiding interactions in community spaces as well as external channels +like social media. Violating these terms may lead to a temporary or permanent +ban. + +### 3. Temporary Ban + +**Community Impact**: A serious violation of community standards, including +sustained inappropriate behavior. + +**Consequence**: A temporary ban from any sort of interaction or public +communication with the community for a specified period of time. No public or +private interaction with the people involved, including unsolicited interaction +with those enforcing the Code of Conduct, is allowed during this period. +Violating these terms may lead to a permanent ban. + +### 4. Permanent Ban + +**Community Impact**: Demonstrating a pattern of violation of community +standards, including sustained inappropriate behavior, harassment of an +individual, or aggression toward or disparagement of classes of individuals. + +**Consequence**: A permanent ban from any sort of public interaction within the +community. + +## Attribution + +This Code of Conduct is adapted from the [Contributor Covenant][homepage], +version 2.1, available at +. + +Community Impact Guidelines were inspired by +[Mozilla's code of conduct enforcement ladder][https://github.com/mozilla/inclusion]. + +For answers to common questions about this code of conduct, see the FAQ at +. Translations are available at . + +[homepage]: https://www.contributor-covenant.org diff --git a/CONDUCT.md b/CONDUCT.md deleted file mode 100644 index 9dc65e7..0000000 --- a/CONDUCT.md +++ /dev/null @@ -1,25 +0,0 @@ -# Contributor Code of Conduct - -As contributors and maintainers of this project, we pledge to respect all people who -contribute through reporting issues, posting feature requests, updating documentation, -submitting pull requests or patches, and other activities. - -We are committed to making participation in this project a harassment-free experience for -everyone, regardless of level of experience, gender, gender identity and expression, -sexual orientation, disability, personal appearance, body size, race, ethnicity, age, or religion. - -Examples of unacceptable behavior by participants include the use of sexual language or -imagery, derogatory comments or personal attacks, trolling, public or private harassment, -insults, or other unprofessional conduct. - -Project maintainers have the right and responsibility to remove, edit, or reject comments, -commits, code, wiki edits, issues, and other contributions that are not aligned to this -Code of Conduct. Project maintainers who do not follow the Code of Conduct may be removed -from the project team. - -Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by -opening an issue or contacting one or more of the project maintainers. - -This Code of Conduct is adapted from the Contributor Covenant -(https://contributor-covenant.org), version 1.0.0, available at -https://contributor-covenant.org/version/1/0/0/. diff --git a/DESCRIPTION b/DESCRIPTION index 42222a6..a25afb0 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -32,4 +32,4 @@ Suggests: foreach, doParallel, org.Hs.eg.db -RoxygenNote: 7.3.1 +RoxygenNote: 7.3.2 diff --git a/R/desc-03-TC.R b/R/desc-03-TC.R index f03703b..fd7c9c8 100644 --- a/R/desc-03-TC.R +++ b/R/desc-03-TC.R @@ -930,8 +930,8 @@ extractTC <- function(x) { n <- nchar(x) TC <- summary(factor( paste(paste(xSplitted[-c(n, n - 1)], xSplitted[-c(1, n)], sep = ""), - xSplitted[-c(1, 2)], - sep = "" + xSplitted[-c(1, 2)], + sep = "" ), levels = TCDict ), maxsum = 8001) / (n - 2) diff --git a/R/desc-04-MoreauBroto.R b/R/desc-04-MoreauBroto.R index de5fc14..8767ecb 100644 --- a/R/desc-04-MoreauBroto.R +++ b/R/desc-04-MoreauBroto.R @@ -101,11 +101,11 @@ #' ) #' ) extractMoreauBroto <- function( - x, props = c( - "CIDH920105", "BHAR880101", "CHAM820101", "CHAM820102", - "CHOC760101", "BIGC670101", "CHAM810101", "DAYM780201" - ), - nlag = 30L, customprops = NULL) { + x, props = c( + "CIDH920105", "BHAR880101", "CHAM820101", "CHAM820102", + "CHOC760101", "BIGC670101", "CHAM810101", "DAYM780201" + ), + nlag = 30L, customprops = NULL) { if (protcheck(x) == FALSE) { stop("x has unrecognized amino acid type") } diff --git a/R/desc-05-Moran.R b/R/desc-05-Moran.R index b624266..ab42977 100644 --- a/R/desc-05-Moran.R +++ b/R/desc-05-Moran.R @@ -101,11 +101,11 @@ #' ) #' ) extractMoran <- function( - x, props = c( - "CIDH920105", "BHAR880101", "CHAM820101", "CHAM820102", - "CHOC760101", "BIGC670101", "CHAM810101", "DAYM780201" - ), - nlag = 30L, customprops = NULL) { + x, props = c( + "CIDH920105", "BHAR880101", "CHAM820101", "CHAM820102", + "CHOC760101", "BIGC670101", "CHAM810101", "DAYM780201" + ), + nlag = 30L, customprops = NULL) { if (protcheck(x) == FALSE) { stop("x has unrecognized amino acid type") } diff --git a/R/desc-06-Geary.R b/R/desc-06-Geary.R index 40e6e25..99efb78 100644 --- a/R/desc-06-Geary.R +++ b/R/desc-06-Geary.R @@ -101,11 +101,11 @@ #' ) #' ) extractGeary <- function( - x, props = c( - "CIDH920105", "BHAR880101", "CHAM820101", "CHAM820102", - "CHOC760101", "BIGC670101", "CHAM810101", "DAYM780201" - ), - nlag = 30L, customprops = NULL) { + x, props = c( + "CIDH920105", "BHAR880101", "CHAM820101", "CHAM820102", + "CHOC760101", "BIGC670101", "CHAM810101", "DAYM780201" + ), + nlag = 30L, customprops = NULL) { if (protcheck(x) == FALSE) { stop("x has unrecognized amino acid type") } diff --git a/R/desc-07-CTDCClass.R b/R/desc-07-CTDCClass.R index 18ea0ea..423a09c 100644 --- a/R/desc-07-CTDCClass.R +++ b/R/desc-07-CTDCClass.R @@ -72,8 +72,8 @@ extractCTDCClass <- function(x, aagroup1, aagroup2, aagroup3) { if (protcheck(x) == FALSE) stop("x has unrecognized amino acid type") if ((length(aagroup1) != length(aagroup2) | - length(aagroup1) != length(aagroup3)) | - (length(aagroup2) != length(aagroup3))) { + length(aagroup1) != length(aagroup3)) | + (length(aagroup2) != length(aagroup3))) { stop("The three groups must have the same property numbers") } diff --git a/R/desc-08-CTDT.R b/R/desc-08-CTDT.R index 6cb2063..4b43322 100644 --- a/R/desc-08-CTDT.R +++ b/R/desc-08-CTDT.R @@ -88,12 +88,13 @@ extractCTDT <- function(x) { # Combine single amino acids by a 2-length step for (i in 1:7) G[[i]] <- paste(G[[i]][-n], G[[i]][-1], sep = "") - G <- lapply(G, function(x) + G <- lapply(G, function(x) { factor(x, levels = c( "G1G2", "G2G1", "G1G3", "G3G1", "G2G3", "G3G2", "G1G1", "G2G2", "G3G3" - ))) + )) + }) GSummary <- lapply(G, summary) diff --git a/R/desc-08-CTDTClass.R b/R/desc-08-CTDTClass.R index f1b4a0d..44fe0df 100644 --- a/R/desc-08-CTDTClass.R +++ b/R/desc-08-CTDTClass.R @@ -74,8 +74,8 @@ extractCTDTClass <- function(x, aagroup1, aagroup2, aagroup3) { } if ((length(aagroup1) != length(aagroup2) | - length(aagroup1) != length(aagroup3)) | - (length(aagroup2) != length(aagroup3))) { + length(aagroup1) != length(aagroup3)) | + (length(aagroup2) != length(aagroup3))) { stop("The three groups must have the same property numbers") } diff --git a/R/desc-09-CTDDClass.R b/R/desc-09-CTDDClass.R index a80d8c4..d4bc379 100644 --- a/R/desc-09-CTDDClass.R +++ b/R/desc-09-CTDDClass.R @@ -74,8 +74,8 @@ extractCTDDClass <- function(x, aagroup1, aagroup2, aagroup3) { } if ((length(aagroup1) != length(aagroup2) | - length(aagroup1) != length(aagroup3)) | - (length(aagroup2) != length(aagroup3))) { + length(aagroup1) != length(aagroup3)) | + (length(aagroup2) != length(aagroup3))) { stop("The three groups must have the same property numbers") } diff --git a/R/desc-10-CTriad.R b/R/desc-10-CTriad.R index 7fd6e7a..eee6b90 100644 --- a/R/desc-10-CTriad.R +++ b/R/desc-10-CTriad.R @@ -1781,14 +1781,16 @@ extractCTriad <- function(x) { xSplitted <- strsplit(x, split = "")[[1]] n <- nchar(x) - CTAll <- summary(factor(paste(paste( - xSplitted[-c(n, n - 1)], xSplitted[-c(1, n)], - sep = "" - ), - xSplitted[-c(1, 2)], - sep = "" - ), - levels = CTDict + CTAll <- summary(factor( + paste( + paste( + xSplitted[-c(n, n - 1)], xSplitted[-c(1, n)], + sep = "" + ), + xSplitted[-c(1, 2)], + sep = "" + ), + levels = CTDict ), maxsum = 8001) MatchedIndex <- which(CTAll != 0) diff --git a/R/desc-10-CTriadClass.R b/R/desc-10-CTriadClass.R index db0a889..8e281ad 100644 --- a/R/desc-10-CTriadClass.R +++ b/R/desc-10-CTriadClass.R @@ -75,14 +75,16 @@ extractCTriadClass <- function(x, aaclass) { xSplitted <- strsplit(x, split = "")[[1L]] n <- nchar(x) - CTAll <- summary(factor(paste(paste( - xSplitted[-c(n, n - 1L)], xSplitted[-c(1L, n)], - sep = "" - ), - xSplitted[-c(1L, 2L)], - sep = "" - ), - levels = CTDict + CTAll <- summary(factor( + paste( + paste( + xSplitted[-c(n, n - 1L)], xSplitted[-c(1L, n)], + sep = "" + ), + xSplitted[-c(1L, 2L)], + sep = "" + ), + levels = CTDict ), maxsum = length(CTDict) + 1L) MatchedIndex <- which(CTAll != 0L) diff --git a/R/desc-13-PAAC.R b/R/desc-13-PAAC.R index 725355d..1c44f7a 100644 --- a/R/desc-13-PAAC.R +++ b/R/desc-13-PAAC.R @@ -101,8 +101,8 @@ #' ) #' ) extractPAAC <- function( - x, props = c("Hydrophobicity", "Hydrophilicity", "SideChainMass"), - lambda = 30, w = 0.05, customprops = NULL) { + x, props = c("Hydrophobicity", "Hydrophilicity", "SideChainMass"), + lambda = 30, w = 0.05, customprops = NULL) { if (protcheck(x) == FALSE) { stop("x has unrecognized amino acid type") } diff --git a/R/desc-14-APAAC.R b/R/desc-14-APAAC.R index c2fd7af..92c77a5 100644 --- a/R/desc-14-APAAC.R +++ b/R/desc-14-APAAC.R @@ -96,8 +96,8 @@ #' ) #' ) extractAPAAC <- function( - x, props = c("Hydrophobicity", "Hydrophilicity"), - lambda = 30, w = 0.05, customprops = NULL) { + x, props = c("Hydrophobicity", "Hydrophilicity"), + lambda = 30, w = 0.05, customprops = NULL) { if (protcheck(x) == FALSE) { stop("x has unrecognized amino acid type") } diff --git a/R/desc-15-PSSM.R b/R/desc-15-PSSM.R index faf4c76..1552346 100644 --- a/R/desc-15-PSSM.R +++ b/R/desc-15-PSSM.R @@ -122,20 +122,20 @@ #' dim(pssmmat) # 20 x 562 (P00750: length 562, 20 Amino Acids) #' } extractPSSM <- function( - seq, start.pos = 1L, end.pos = nchar(seq), - psiblast.path = NULL, makeblastdb.path = NULL, - database.path = NULL, iter = 5, silent = TRUE, - evalue = 10L, word.size = NULL, - gapopen = NULL, gapextend = NULL, - matrix = "BLOSUM62", threshold = NULL, - seg = "no", soft.masking = FALSE, - culling.limit = NULL, best.hit.overhang = NULL, - best.hit.score.edge = NULL, - xdrop.ungap = NULL, xdrop.gap = NULL, - xdrop.gap.final = NULL, - window.size = NULL, gap.trigger = 22L, - num.threads = 1L, pseudocount = 0L, - inclusion.ethresh = 0.002) { + seq, start.pos = 1L, end.pos = nchar(seq), + psiblast.path = NULL, makeblastdb.path = NULL, + database.path = NULL, iter = 5, silent = TRUE, + evalue = 10L, word.size = NULL, + gapopen = NULL, gapextend = NULL, + matrix = "BLOSUM62", threshold = NULL, + seg = "no", soft.masking = FALSE, + culling.limit = NULL, best.hit.overhang = NULL, + best.hit.score.edge = NULL, + xdrop.ungap = NULL, xdrop.gap = NULL, + xdrop.gap.final = NULL, + window.size = NULL, gap.trigger = 22L, + num.threads = 1L, pseudocount = 0L, + inclusion.ethresh = 0.002) { if (Sys.which("makeblastdb") == "" & is.null(makeblastdb.path)) { stop("Please install makeblastdb (included in NCBI BLAST+) or specify makeblastdb.path") } diff --git a/R/desc-15-PSSMAcc.R b/R/desc-15-PSSMAcc.R index bdca3f5..1f2defc 100644 --- a/R/desc-15-PSSMAcc.R +++ b/R/desc-15-PSSMAcc.R @@ -45,7 +45,6 @@ #' tail(pssmacc) #' } extractPSSMAcc <- function(pssmmat, lag) { - # Normalize PSSM scores to (0, 1) mat <- 1 / (1 + exp(pssmmat)) diff --git a/R/desc-15-PSSMFeature.R b/R/desc-15-PSSMFeature.R index f98b71b..9cd8845 100644 --- a/R/desc-15-PSSMFeature.R +++ b/R/desc-15-PSSMFeature.R @@ -52,7 +52,6 @@ #' head(pssmfeature) #' } extractPSSMFeature <- function(pssmmat) { - # Normalize PSSM scores to (0, 1) res <- as.vector(1 / (1 + exp(pssmmat))) diff --git a/R/misc-01-readFASTA.R b/R/misc-01-readFASTA.R index 26c82c4..f58a44f 100755 --- a/R/misc-01-readFASTA.R +++ b/R/misc-01-readFASTA.R @@ -41,9 +41,8 @@ #' @examples #' P00750 <- readFASTA(system.file("protseq/P00750.fasta", package = "protr")) readFASTA <- function( - file = system.file("protseq/P00750.fasta", package = "protr"), - legacy.mode = TRUE, seqonly = FALSE) { - + file = system.file("protseq/P00750.fasta", package = "protr"), + legacy.mode = TRUE, seqonly = FALSE) { # Read the FASTA file as a vector of strings lines <- readLines(file) @@ -74,7 +73,9 @@ readFASTA <- function( function(i) paste(lines[start[i]:end[i]], collapse = "") ) - if (seqonly) return(sequences) + if (seqonly) { + return(sequences) + } # Read sequence names nomseq <- lapply(seq_len(nseq), function(i) { diff --git a/R/misc-04-protseg.R b/R/misc-04-protseg.R index 2619451..cfc767e 100644 --- a/R/misc-04-protseg.R +++ b/R/misc-04-protseg.R @@ -26,12 +26,12 @@ #' x <- readFASTA(system.file("protseq/P00750.fasta", package = "protr"))[[1]] #' protseg(x, aa = "R", k = 5) protseg <- function( - x, aa = c( - "A", "R", "N", "D", "C", - "E", "Q", "G", "H", "I", - "L", "K", "M", "F", "P", - "S", "T", "W", "Y", "V" - ), k = 7) { + x, aa = c( + "A", "R", "N", "D", "C", + "E", "Q", "G", "H", "I", + "L", "K", "M", "F", "P", + "S", "T", "W", "Y", "V" + ), k = 7) { aa <- match.arg(aa) xSplitted <- strsplit(x, split = "")[[1]] diff --git a/R/misc-07-removeGaps.R b/R/misc-07-removeGaps.R index 7c67809..384e1e3 100644 --- a/R/misc-07-removeGaps.R +++ b/R/misc-07-removeGaps.R @@ -33,5 +33,6 @@ #' nogapseq <- removeGaps(aaseq) #' parSeqSim(nogapseq) #' } -removeGaps <- function(x, pattern = "-", replacement = "", ...) +removeGaps <- function(x, pattern = "-", replacement = "", ...) { gsub(pattern, replacement, x, ...) +} diff --git a/R/par-02-parGOSim.R b/R/par-02-parGOSim.R index 631b259..15fdf5c 100644 --- a/R/par-02-parGOSim.R +++ b/R/par-02-parGOSim.R @@ -82,10 +82,10 @@ #' twoGOSim(gene1, gene2, type = "gene", ont = "BP", measure = "Lin") #' } twoGOSim <- function( - id1, id2, - type = c("go", "gene"), - ont = c("MF", "BP", "CC"), organism = "human", - measure = "Resnik", combine = "BMA") { + id1, id2, + type = c("go", "gene"), + ont = c("MF", "BP", "CC"), organism = "human", + measure = "Resnik", combine = "BMA") { type <- match.arg(type) ont <- match.arg(ont) godb <- suppressMessages(GOSemSim::godata(OrgDb = .orgdb[organism], ont = ont, computeIC = TRUE)) @@ -117,10 +117,11 @@ twoGOSim <- function( id2good <- 1:length(golist[[id2]]) gid1 <- as.character(golist[[id1]][id1good]) gid2 <- as.character(golist[[id2]][id2good]) - res <- try(suppressWarnings( - GOSemSim::mgoSim(gid1, gid2, semData = godb, measure = measure, combine = combine) - ), - silent = TRUE + res <- try( + suppressWarnings( + GOSemSim::mgoSim(gid1, gid2, semData = godb, measure = measure, combine = combine) + ), + silent = TRUE ) if (is.numeric(res)) { sim <- res @@ -204,10 +205,10 @@ twoGOSim <- function( #' parGOSim(genelist, type = "gene", ont = "BP", measure = "Wang") #' } parGOSim <- function( - golist, - type = c("go", "gene"), - ont = c("MF", "BP", "CC"), organism = "human", - measure = "Resnik", combine = "BMA") { + golist, + type = c("go", "gene"), + ont = c("MF", "BP", "CC"), organism = "human", + measure = "Resnik", combine = "BMA") { type <- match.arg(type) ont <- match.arg(ont) godb <- suppressMessages(GOSemSim::godata(OrgDb = .orgdb[organism], ont = ont, computeIC = TRUE)) diff --git a/R/pcm-01-extractScalesGap.R b/R/pcm-01-extractScalesGap.R index 6835dae..57aa56d 100644 --- a/R/pcm-01-extractScalesGap.R +++ b/R/pcm-01-extractScalesGap.R @@ -53,7 +53,7 @@ #' AAidxmat <- t(na.omit(as.matrix(AAindex[, 7:26]))) #' scales <- extractScalesGap(x, propmat = AAidxmat, pc = 5, lag = 7, silent = FALSE) extractScalesGap <- function( - x, propmat, pc, lag, scale = TRUE, silent = TRUE) { + x, propmat, pc, lag, scale = TRUE, silent = TRUE) { if (.protcheckgap(x) == FALSE) { stop('x has unrecognized amino acid types. Note: use "-" to represent gaps.') } diff --git a/R/pcm-02-extractDescScales.R b/R/pcm-02-extractDescScales.R index 910c646..26a7b0c 100644 --- a/R/pcm-02-extractDescScales.R +++ b/R/pcm-02-extractDescScales.R @@ -46,7 +46,7 @@ #' pc = 5, lag = 7, silent = FALSE #' ) extractDescScales <- function( - x, propmat, index = NULL, pc, lag, scale = TRUE, silent = TRUE) { + x, propmat, index = NULL, pc, lag, scale = TRUE, silent = TRUE) { propmat <- get(propmat) if (!is.null(index)) propmat <- propmat[, index] diff --git a/R/pcm-03-extractProtFP.R b/R/pcm-03-extractProtFP.R index 49e0689..541a4a6 100644 --- a/R/pcm-03-extractProtFP.R +++ b/R/pcm-03-extractProtFP.R @@ -33,7 +33,7 @@ #' x <- readFASTA(system.file("protseq/P00750.fasta", package = "protr"))[[1]] #' fp <- extractProtFP(x, index = c(160:165, 258:296), pc = 5, lag = 7, silent = FALSE) extractProtFP <- function( - x, index = NULL, pc, lag, scale = TRUE, silent = TRUE) { + x, index = NULL, pc, lag, scale = TRUE, silent = TRUE) { propmat <- get("AAindex") if (!is.null(index)) { diff --git a/R/pcm-03-extractProtFPGap.R b/R/pcm-03-extractProtFPGap.R index b7ae5b3..7079640 100644 --- a/R/pcm-03-extractProtFPGap.R +++ b/R/pcm-03-extractProtFPGap.R @@ -36,7 +36,7 @@ #' x <- readFASTA(system.file("protseq/align.fasta", package = "protr"))$`IXI_235` #' fp <- extractProtFPGap(x, index = c(160:165, 258:296), pc = 5, lag = 7, silent = FALSE) extractProtFPGap <- function( - x, index = NULL, pc, lag, scale = TRUE, silent = TRUE) { + x, index = NULL, pc, lag, scale = TRUE, silent = TRUE) { propmat <- get("AAindex") if (!is.null(index)) { diff --git a/R/pcm-04-extractFAScales.R b/R/pcm-04-extractFAScales.R index 462d754..a75d4d5 100644 --- a/R/pcm-04-extractFAScales.R +++ b/R/pcm-04-extractFAScales.R @@ -43,8 +43,8 @@ #' tprops <- AATopo[, c(37:41, 43:47)] # select a set of topological descriptors #' fa <- extractFAScales(x, propmat = tprops, factors = 5, lag = 7, silent = FALSE) extractFAScales <- function( - x, propmat, factors, scores = "regression", - lag, scale = TRUE, silent = TRUE) { + x, propmat, factors, scores = "regression", + lag, scale = TRUE, silent = TRUE) { if (protcheck(x) == FALSE) { stop("x has unrecognized amino acid type") } diff --git a/R/pcm-05-extractMDSScales.R b/R/pcm-05-extractMDSScales.R index 6a6eeba..a0383b5 100644 --- a/R/pcm-05-extractMDSScales.R +++ b/R/pcm-05-extractMDSScales.R @@ -42,7 +42,7 @@ #' tprops <- AATopo[, c(37:41, 43:47)] # select a set of topological descriptors #' mds <- extractMDSScales(x, propmat = tprops, k = 5, lag = 7, silent = FALSE) extractMDSScales <- function( - x, propmat, k, lag, scale = TRUE, silent = TRUE) { + x, propmat, k, lag, scale = TRUE, silent = TRUE) { if (protcheck(x) == FALSE) { stop("x has unrecognized amino acid type") } diff --git a/R/pcm-06-extractBLOSUM.R b/R/pcm-06-extractBLOSUM.R index 4dbf9a4..edf61bb 100644 --- a/R/pcm-06-extractBLOSUM.R +++ b/R/pcm-06-extractBLOSUM.R @@ -36,7 +36,7 @@ #' x <- readFASTA(system.file("protseq/P00750.fasta", package = "protr"))[[1]] #' blosum <- extractBLOSUM(x, submat = "AABLOSUM62", k = 5, lag = 7, scale = TRUE, silent = FALSE) extractBLOSUM <- function( - x, submat = "AABLOSUM62", k, lag, scale = TRUE, silent = TRUE) { + x, submat = "AABLOSUM62", k, lag, scale = TRUE, silent = TRUE) { if (protcheck(x) == FALSE) { stop("x has unrecognized amino acid type") } diff --git a/README.md b/README.md index bbeb359..f168a9e 100644 --- a/README.md +++ b/README.md @@ -119,4 +119,8 @@ GO semantic similarity measures: ## Contribute -To contribute to this project, please take a look at the [Contributing Guidelines](https://nanx.me/protr/CONTRIBUTING.html) first. Please note that this project is released with a [Contributor Code of Conduct](https://nanx.me/protr/CONDUCT.html). By participating in this project you agree to abide by its terms. +To contribute to this project, please take a look at the +[Contributing Guidelines](https://nanx.me/protr/CONTRIBUTING.html) first. +Please note that the protr project is released with a +[Contributor Code of Conduct](https://nanx.me/protr/CODE_OF_CONDUCT.html). +By contributing to this project, you agree to abide by its terms. diff --git a/_pkgdown.yml b/_pkgdown.yml index 5fb980a..c3a399a 100644 --- a/_pkgdown.yml +++ b/_pkgdown.yml @@ -6,9 +6,9 @@ template: preset: "bootstrap" reference: - - title: "Protein and peptide sequence descriptors" + - title: "Protein sequence descriptors" desc: > - Functions for calculating protein/peptide sequence descriptors. + Calculate protein and peptide sequence descriptors. contents: - extractAAC - extractDC @@ -28,17 +28,17 @@ reference: - extractQSO - extractPAAC - extractAPAAC - - title: "Profile-based protein and peptide sequence descriptors" + - title: "PSSM descriptors" desc: > - Functions for calculating the profile-based (PSSM) descriptors. + Calculate PSSM (profile-based) protein and peptide sequence descriptors. contents: - extractPSSM - extractPSSMAcc - extractPSSMFeature - acc - - title: "Proteochemometric Modeling (PCM) descriptors" + - title: "PCM descriptors" desc: > - Functions for calculating proteochemometric modeling (PCM) descriptors. + Calculate PCM (proteochemometric modeling) descriptors. contents: - extractScales - extractScalesGap @@ -50,7 +50,7 @@ reference: - extractBLOSUM - title: "Similarity measures between proteins" desc: > - Functions for calculating protein sequence alignment based + Calculate protein sequence alignment based similarity measures and GO-based semantic similarity measures. contents: - parSeqSim @@ -62,7 +62,7 @@ reference: - twoGOSim - title: "Pre-process protein sequences" desc: > - Helper Functions for pre-processing protein sequences. + Helper functions for pre-processing protein sequences. contents: - getUniProt - readFASTA @@ -71,7 +71,7 @@ reference: - protseg - removeGaps - protr-package - - title: "Precomputed molecular descriptors for the 20 amino acids" + - title: "Precomputed molecular descriptors" desc: > Precomputed molecular descriptors for the 20 amino acids. contents: @@ -101,7 +101,7 @@ reference: - AATopoChg - AAWalk - AAWHIM - - title: "BLOSUM and PAM matrices for the 20 amino acids" + - title: "BLOSUM and PAM matrices" desc: > BLOSUM and PAM matrices for the 20 amino acids. contents: diff --git a/vignettes/bioinformatics.csl b/vignettes/bioinformatics.csl deleted file mode 100644 index 7d2c861..0000000 --- a/vignettes/bioinformatics.csl +++ /dev/null @@ -1,134 +0,0 @@ - - diff --git a/vignettes/custom.css b/vignettes/custom.css index c503fde..e7138b3 100644 --- a/vignettes/custom.css +++ b/vignettes/custom.css @@ -1,5 +1,3 @@ -/* custom css style for Nan Xiao's R package vignettes */ - body { font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, "Helvetica Neue", Arial, "Noto Sans", sans-serif, "Apple Color Emoji", "Segoe UI Emoji", "Segoe UI Symbol", "Noto Color Emoji"; font-size: 16px; @@ -13,12 +11,18 @@ h1.title { padding-bottom: 10px; } -h1, h2, h3, h4, h5, h6 { +h1, +h2, +h3, +h4, +h5, +h6 { color: #212529; font-weight: 700; } -h1, h1.title { +h1, +h1.title { font-size: 30px; } @@ -66,7 +70,9 @@ a { color: #4582EC; } -a:hover, a:focus, a:active { +a:hover, +a:focus, +a:active { color: #1559CF; } @@ -74,11 +80,13 @@ a:focus { outline: thin dotted; } -a:hover, a:active { +a:hover, +a:active { outline: 0; } -pre, code { +pre, +code { font-family: SFMono-Regular, Menlo, Monaco, Consolas, "Liberation Mono", "Courier New", monospace; background-color: #f8f9fa; border: none; @@ -86,7 +94,9 @@ pre, code { /* float toc */ -.list-group-item.active, .list-group-item.active:focus, .list-group-item.active:hover { +.list-group-item.active, +.list-group-item.active:focus, +.list-group-item.active:hover { background-color: #4582EC; border-color: #4582EC; } @@ -108,7 +118,8 @@ p.caption { color: #4287c7; } -.remark-code, .remark-inline-code { +.remark-code, +.remark-inline-code { font-family: SFMono-Regular, Menlo, Monaco, Consolas, "Liberation Mono", "Courier New", monospace; background-color: #c8c8c8; } @@ -122,11 +133,14 @@ p.caption { padding-top: 110px; } -.title-slide h1, .title-slide h2, .title-slide h3 { +.title-slide h1, +.title-slide h2, +.title-slide h3 { color: #585858; } -.title-slide h1, .title-slide h2 { +.title-slide h1, +.title-slide h2 { font-weight: 700; margin-top: 20px; margin-bottom: 80px; @@ -142,7 +156,8 @@ p.caption { background-color: #789d57; } -.dark-green h2, .dark-green h1 { +.dark-green h2, +.dark-green h1 { color: #fff; } @@ -150,7 +165,8 @@ p.caption { background-color: #585858; } -.dark-gray h2, .dark-gray h1 { +.dark-gray h2, +.dark-gray h1 { color: #fff; } @@ -163,7 +179,8 @@ p.caption { float: left; } -.left-column h2:last-of-type, .left-column h3:last-child { +.left-column h2:last-of-type, +.left-column h3:last-child { color: #000; } @@ -187,7 +204,9 @@ p.caption { clear: both; } -img, video, iframe { +img, +video, +iframe { max-width: 100%; } @@ -205,10 +224,13 @@ table thead th { border-bottom: 1px solid #ddd; } -th, td { +th, +td { padding: 5px; } -thead, tfoot, tr:nth-child(even) { +thead, +tfoot, +tr:nth-child(even) { background: #eee -} +} \ No newline at end of file diff --git a/vignettes/protr.Rmd b/vignettes/protr.Rmd index 7728bea..8960904 100644 --- a/vignettes/protr.Rmd +++ b/vignettes/protr.Rmd @@ -2,7 +2,6 @@ title: "protr: R package for generating various numerical representation schemes of protein sequences" author: "Nan Xiao <>" bibliography: protr.bib -csl: bioinformatics.csl output: rmarkdown::html_document: toc: true @@ -86,17 +85,21 @@ stored in two separated FASTA files with `readFASTA()`: ```{r} library("protr") -# load FASTA files -# (system.file is for accessing example file in protr package, -# replace it with your path) -extracell <- readFASTA(system.file( - "protseq/extracell.fasta", - package = "protr" -)) -mitonchon <- readFASTA(system.file( - "protseq/mitochondrion.fasta", - package = "protr" -)) +# Load FASTA files +# Note that `system.file()` is for accessing example files +# in the protr package. Replace it with your own file path. +extracell <- readFASTA( + system.file( + "protseq/extracell.fasta", + package = "protr" + ) +) +mitonchon <- readFASTA( + system.file( + "protseq/mitochondrion.fasta", + package = "protr" + ) +) ``` To read protein sequences stored in PDB format files, use `readPDB()` instead. @@ -153,12 +156,12 @@ i.e., the amphiphilic pseudo amino acid composition (APseAAC) descriptor [@chouapaac] and make class labels for classification modeling. ```{r, eval = FALSE} -# calculate APseAAC descriptors +# Calculate APseAAC descriptors x1 <- t(sapply(extracell, extractAPAAC)) x2 <- t(sapply(mitonchon, extractAPAAC)) x <- rbind(x1, x2) -# make class labels +# Make class labels labels <- as.factor(c(rep(0, length(extracell)), rep(1, length(mitonchon)))) ``` @@ -171,7 +174,7 @@ Next, we will split the data into a 75% training set and a 25% test set. ```{r, eval = FALSE} set.seed(1001) -# split training and test set +# Split training and test set tr.idx <- c( sample(1:nrow(x1), round(nrow(x1) * 0.75)), sample(nrow(x1) + 1:nrow(x2), round(nrow(x2) * 0.75)) @@ -214,10 +217,10 @@ and plot the ROC curve with the `pROC` package, as is shown in Figure 1. ```{r, eval = FALSE} -# predict on test set +# Predict on test set rf.pred <- predict(rf.fit, newdata = x.te, type = "prob")[, 1] -# plot ROC curve +# Plot ROC curve library("pROC") plot.roc(y.te, rf.pred, grid = TRUE, print.auc = TRUE) ``` @@ -1278,9 +1281,9 @@ amino acids in the sequence belongs to the 20 default types: ```{r, protcheck} x <- readFASTA(system.file("protseq/P00750.fasta", package = "protr"))[[1]] -# a real sequence +# A real sequence protcheck(x) -# an artificial sequence +# An artificial sequence protcheck(paste(x, "Z", sep = "")) ```