visdat package #87

@njtierney

Summary

  • What does this package do? (explain in 50 words or less)

visdat visualises R data frames so that you can quickly identify data structure and data types. This makes it easier to "get a look at the data" and visually identify abnormalities with a dataset.

  • Paste the full DESCRIPTION file inside a code block below.
Package: visdat
Title: Preliminary Visualisation of Data
Version: 0.0.5.9000
Authors@R: person("Nicholas", "Tierney", email = "nicholas.tierney@gmail.com", role = c("aut", "cre"))
Description: visdat makes it easy to visualise your whole dataset so that you
    can visually identify problems.
Depends:
    R (>= 3.2.2)
License: MIT + file LICENSE
LazyData: true
RoxygenNote: 5.0.1.9000
Imports:
    ggplot2,
    tidyr,
    dplyr,
    purrr,
    readr,
    plotly (>= 4.5.6),
    magrittr,
    stats
URL: https://njtierney.com/visdat
BugReports: https://github.com/njtierney/visdat/issues
Suggests:
    testthat,
    knitr,
    rmarkdown,
    vdiffr
VignetteBuilder: knitr

  • URL for the package (the development repository, not a stylized html page)

https://github.com/njtierney/visdat

  • Who is the target audience?

R users who want to explore their data, particularly when they first receive it.

  • Are there other R packages that accomplish the same thing? If so, what is different about yours?

A few other packages visualise missing data as a heatmap. The mi package used to have a visualisation method for missing data, missing.pattern.plot, but this is no longer present in the latest versions. The Amelia package has missmap, but its default output requires some more work to make it easier to read.

The VIM package provides visualisations for missing data, for example, the aggr function provides a histogram of the missingness present in each variable.

In terms of visualising the types of data in a dataset, the wakefield package provides the table_heat function for visualising column data types.

But what makes visdat different?

visdat adheres to the principle that R packages should try to do one thing: it is a simple package that specialises in visualising data frames. Amelia and mi focus on multiple imputation and missing data methods. VIM focusses on visualising missingness and imputation in data, and the wakefield package focusses on creating random, reproducible data.

Visualising missing data is not the main focus of these packages. Because visdat is purely about visualising data frames, I argue that it has greater scope to do this one thing well.

Requirements

Confirm each of the following by checking the box. This package:

  • does not violate the Terms of Service of any service it interacts with.
  • has a CRAN and OSI accepted license.
  • contains a README with instructions for installing the development version.
  • includes documentation with examples for all functions.
  • contains a vignette with examples of its essential functions and uses.
  • has a test suite.
  • has continuous integration with Travis CI and/or another service.

Publication options

  • Do you intend for this package to go on CRAN?
  • Do you wish to automatically submit to the Journal of Open Source Software? If so:
    • The package contains a paper.md with a high-level description.
    • The package is deposited in a long-term repository with the DOI:

Detail

  • Does R CMD check (or devtools::check()) succeed? Paste and describe any errors or warnings:

It succeeds, but there are some notes.

checking R code for possible problems ... NOTE
vis_guess: no visible binding for global variable ‘valueGuess’
Undefined global functions or variables:
  valueGuess
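
A NOTE like this usually appears when a column name is referenced through non-standard evaluation (for example inside a ggplot2 or dplyr call). One common way to address it is sketched below. This is an assumption about the cause, not necessarily how visdat resolves it, and the file name `R/globals.R` is a hypothetical placement.

```r
# Hypothetical file: R/globals.R
# Assumption: `valueGuess` is a data frame column that vis_guess() refers to
# via non-standard evaluation, so R CMD check cannot see where it is defined.
# Declaring the name tells the check that it is intentional.
utils::globalVariables("valueGuess")
```

This only silences the check; it does not change how `vis_guess` behaves. If the code relies on tidy evaluation, referring to the column as `.data$valueGuess` may be an alternative that avoids declaring a global at all.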

It does not yet have an rOpenSci footer image.

I have not set up pre-commit hooks to ensure that README.md is always newer than README.Rmd, as I'm not sure what devtools::use_git_hook does.

I have not added #' @noRd to internal functions as I think it is still useful to have them documented, but I can change this if need be.

  • If this is a resubmission following rejection, please explain the change in circumstances:

  • If possible, please provide recommendations of reviewers - those with experience with similar packages and/or likely users of your package - and their GitHub user names:

  • Maëlle Salmon (@masalmon)

  • Jenny Bryan (@jennybc)

  • Andrew MacDonald (@aammd)
