-
Notifications
You must be signed in to change notification settings - Fork 56
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prepare for a new version #145
base: master
Are you sure you want to change the base?
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #145 +/- ##
===========================================
- Coverage 83.33% 63.88% -19.45%
===========================================
Files 3 4 +1
Lines 24 36 +12
===========================================
+ Hits 20 23 +3
- Misses 4 13 +9 ☔ View full report in Codecov by Sentry. |
Co-authored-by: jbrea <jbrea@users.noreply.github.com>
* Streamline adding a new dataset * Add instructions to README for adding a new dataset * Add scripts to update the dataset metadata * Add update_doc method to only add a single dataset * Add HTML documentation generation to update_doc * Change update_doc to correctly round trip quotes in the metadata CSV * Sort datasets CSV * Allow datasets with a .RData extension as well as .rda --------- Co-authored-by: Frankie Robertson <frankie@robertson.name>
This allows them to be displayed in a much better way in the REPL.
The changes look to make sense. I left one comment. I am not a maintainer of this package (and I do not know its internals). Maybe @nalimilan knows who has appropriate knowledge of the internals to approve it. Thank you for working on it. |
Appreciate everyone's work on this package. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the delay! I have a few comments.
A type to hold the content of a dataset description. | ||
|
||
The main purpose of its existence is to provide a way to display the content | ||
differently in HTML and markdown contexts. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
differently in HTML and markdown contexts. | |
differently in HTML and Markdown contexts. |
|
||
!!! note Unexported | ||
This function is left deliberately unexported, since the name is pretty common. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't a standard pattern AFAIK. Better mark the function as public via @compat public description
at the same place as exports. This is available since Compat 3.47.0 and 4.10.0. Could also add packages
to that list BTW.
!!! note Unexported | |
This function is left deliberately unexported, since the name is pretty common. |
|
||
Invoke this function in exactly the same way you would invoke `dataset` to get the dataset itself. | ||
|
||
This object prints well in the REPL, and can also be shown as markdown or HTML. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This object prints well in the REPL, and can also be shown as markdown or HTML. | |
This object prints well in the REPL, and can also be shown as Markdown or HTML. |
RDatasets.description(package_name::AbstractString, dataset_name::AbstractString) | ||
RDatasets.description(df::DataFrame) # only call this on dataframes from RDatasets! | ||
|
||
Returns an `RDatasetDescription` object containing the description of the dataset. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Returns an `RDatasetDescription` object containing the description of the dataset. | |
Return an `RDatasetDescription` object containing the description of the dataset. |
|
||
""" | ||
RDatasets.description(package_name::AbstractString, dataset_name::AbstractString) | ||
RDatasets.description(df::DataFrame) # only call this on dataframes from RDatasets! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Put this information in the docstring body instead. Also say what happens if that's not the case.
RDatasets.description(df::DataFrame) # only call this on dataframes from RDatasets! | |
RDatasets.description(df::DataFrame) |
error("Unable to locate dataset file $rdaname or $csvname") | ||
end | ||
# Finally, inject metadata into the dataframe to indicate origin: | ||
DataFrames.metadata!(dataset, "RDatasets.jl", (string(package_name), string(dataset_name))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not needed AFAICT:
DataFrames.metadata!(dataset, "RDatasets.jl", (string(package_name), string(dataset_name))) | |
metadata!(dataset, "RDatasets.jl", (string(package_name), string(dataset_name))) |
The main purpose of its existence is to provide a way to display the content | ||
differently in HTML and markdown contexts. | ||
|
||
Invoked through [`RDatasets.description`](@ref). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Invoked through [`RDatasets.description`](@ref). | |
Obtained through [`RDatasets.description`](@ref). |
if "RDatasets.jl" in DataFrames.metadatakeys(df) | ||
package_name, dataset_name = DataFrames.metadata(df, "RDatasets.jl") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if "RDatasets.jl" in DataFrames.metadatakeys(df) | |
package_name, dataset_name = DataFrames.metadata(df, "RDatasets.jl") | |
if "RDatasets.jl" in metadatakeys(df) | |
package_name, dataset_name = metadata(df, "RDatasets.jl") |
@@ -1,12 +1,13 @@ | |||
name = "RDatasets" | |||
uuid = "ce6b1742-4840-55fa-b093-852dadbb1d8b" | |||
version = "0.7.7" | |||
version = "0.8.0" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably a good occasion to tag 1.0.0. Clearly the package is stable enough.
version = "0.8.0" | |
version = "1.0.0" |
RDatasets.description(iris) # only use this on DataFrames returned from `dataset`! | ||
``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RDatasets.description(iris) # only use this on DataFrames returned from `dataset`! | |
``` | |
RDatasets.description(iris) | |
``` | |
Only use the latter on data frames returned from `dataset`. | |
This is a combined PR for a bunch of different PRs that are currently up. Below is a summary of changes:
dataset
, indicating that the dataframe was generated by RDatasets.jl and mentioning its package and dataset name as a Tuple. This is essentially a callDataFrames.metadata!(df, "RDatasets.jl" => (package_name, dataset_name))
.description
function to RDatasets, make it readable in the REPLPRs #135 from @frankier and #124 from @jbrea are incorporated here.