Prepare for a new version #145

asinghvi17 · 2024-04-02T12:33:58Z

This PR combines several PRs that are currently open. Below is a summary of changes:

- Add metadata to the data frame returned by `dataset`, indicating that it was generated by RDatasets.jl and recording its package and dataset name as a `Tuple`. This is essentially a call to `DataFrames.metadata!(df, "RDatasets.jl", (package_name, dataset_name))`.
- Add a `description` function to RDatasets and make its output readable in the REPL.
  - Make this function discoverable and document it.
- Bump RData.jl compat to 1.
- Add instructions for adding data and improve the data addition script.
- Bump version to v0.8.

PRs #135 from @frankier and #124 from @jbrea are incorporated here.

codecov-commenter · 2024-04-02T12:37:19Z

Codecov Report

Attention: Patch coverage is 18.18182%, with 9 lines in your changes missing coverage. Please review.

Project coverage is 63.88%. Comparing base (b1a5959) to head (4aac673).

Files	Patch %	Lines
src/dataset.jl	18.18%	9 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff             @@
##           master     #145       +/-   ##
===========================================
- Coverage   83.33%   63.88%   -19.45%     
===========================================
  Files           3        4        +1     
  Lines          24       36       +12     
===========================================
+ Hits           20       23        +3     
- Misses          4       13        +9


* Streamline adding a new dataset
* Add instructions to README for adding a new dataset
* Add scripts to update the dataset metadata
* Add update_doc method to only add a single dataset
* Add HTML documentation generation to update_doc
* Change update_doc to correctly round trip quotes in the metadata CSV
* Sort datasets CSV
* Allow datasets with a .RData extension as well as .rda

Co-authored-by: Frankie Robertson <frankie@robertson.name>

Project.toml

bkamins · 2024-04-04T18:42:53Z

The changes look sensible to me. I left one comment. I am not a maintainer of this package (and I do not know its internals). Maybe @nalimilan knows who has appropriate knowledge of the internals to approve it. Thank you for working on it.

kdpsingh · 2025-01-11T00:09:21Z

Appreciate everyone's work on this package.

nalimilan

Sorry for the delay! I have a few comments.

nalimilan · 2025-01-26T21:41:50Z

src/dataset.jl

+A type to hold the content of a dataset description.
+
+The main purpose of its existence is to provide a way to display the content
+differently in HTML and markdown contexts.


Suggested change

-differently in HTML and markdown contexts.
+differently in HTML and Markdown contexts.
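
For readers unfamiliar with how a Julia type can render differently per display context, here is a minimal sketch. `DemoDescription` is a hypothetical stand-in, not the PR's `RDatasetDescription`, and the tag-stripping regex is only an illustration; the PR's own rewriter targets Markdown and may work quite differently.

```julia
# Hypothetical stand-in for the PR's description type; the real
# `RDatasetDescription` may store and render its content differently.
struct DemoDescription
    content::String  # raw HTML help text
end

# HTML-capable displays (e.g. notebooks) receive the markup unchanged.
Base.show(io::IO, ::MIME"text/html", d::DemoDescription) = print(io, d.content)

# Plain-text displays (e.g. the REPL) receive the text with tags stripped.
# The regex is a crude illustration, not a real HTML parser.
function Base.show(io::IO, ::MIME"text/plain", d::DemoDescription)
    print(io, replace(d.content, r"<[^>]+>" => ""))
end

d = DemoDescription("<p>Edgar Anderson's <b>iris</b> data.</p>")
html_text = repr(MIME("text/html"), d)
plain_text = repr(MIME("text/plain"), d)
```

The display system picks the richest MIME type the current frontend supports, so one object can serve both contexts.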

nalimilan · 2025-01-26T21:45:25Z

src/dataset.jl

+
+!!! note Unexported
+    This function is left deliberately unexported, since the name is pretty common.


This isn't a standard pattern AFAIK. Better to mark the function as public via `@compat public description` in the same place as the exports. This is available since Compat 3.47.0 and 4.10.0. `packages` could also be added to that list, BTW.

Suggested change

-!!! note Unexported
-    This function is left deliberately unexported, since the name is pretty common.
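
A minimal sketch of the pattern suggested above, assuming Compat.jl is installed at one of the versions mentioned. `DemoDatasets` and its function bodies are hypothetical placeholders, not the package's actual code.

```julia
module DemoDatasets

using Compat: @compat

# Exported names are brought into scope by `using DemoDatasets`.
export dataset

# Public but unexported: part of the API, called with the module prefix.
# On Julia versions without the `public` keyword this line is a no-op.
@compat public description, packages

# Hypothetical placeholder implementations.
dataset(package_name, dataset_name) = "data for $package_name/$dataset_name"
description(package_name, dataset_name) = "description of $package_name/$dataset_name"
packages() = ["datasets"]

end # module

using .DemoDatasets

exported_result = dataset("datasets", "iris")
public_result = DemoDatasets.description("datasets", "iris")
```

Marking a name public documents it as supported API without putting a common name like `description` into every user's namespace.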

nalimilan · 2025-01-26T21:45:51Z

src/dataset.jl

+
+Invoke this function in exactly the same way you would invoke `dataset` to get the dataset itself.
+
+This object prints well in the REPL, and can also be shown as markdown or HTML.


Suggested change

-This object prints well in the REPL, and can also be shown as markdown or HTML.
+This object prints well in the REPL, and can also be shown as Markdown or HTML.

nalimilan · 2025-01-26T21:45:58Z

src/dataset.jl

+    RDatasets.description(package_name::AbstractString, dataset_name::AbstractString)
+    RDatasets.description(df::DataFrame) # only call this on dataframes from RDatasets!
+
+Returns an `RDatasetDescription` object containing the description of the dataset.


Suggested change

-Returns an `RDatasetDescription` object containing the description of the dataset.
+Return an `RDatasetDescription` object containing the description of the dataset.

nalimilan · 2025-01-26T21:47:11Z

src/dataset.jl

+
+"""
+    RDatasets.description(package_name::AbstractString, dataset_name::AbstractString)
+    RDatasets.description(df::DataFrame) # only call this on dataframes from RDatasets!


Put this information in the docstring body instead. Also say what happens if that's not the case.

Suggested change

-    RDatasets.description(df::DataFrame) # only call this on dataframes from RDatasets!
+    RDatasets.description(df::DataFrame)
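
One way the requested docstring layout could look, as a sketch only. `describe_origin` is a hypothetical function, and throwing an `ArgumentError` for data frames without the metadata key is an assumption; the PR does not state what the real `description` does in that case.

```julia
using DataFrames

"""
    describe_origin(package_name::AbstractString, dataset_name::AbstractString)
    describe_origin(df::DataFrame)

Return a short string naming the origin of a dataset.

The `DataFrame` method only works on data frames that carry the
`"RDatasets.jl"` metadata key. For any other data frame it throws an
`ArgumentError` (an assumption made for this sketch).
"""
describe_origin(package_name::AbstractString, dataset_name::AbstractString) =
    "$dataset_name from R package $package_name"

function describe_origin(df::DataFrame)
    if "RDatasets.jl" in metadatakeys(df)
        package_name, dataset_name = metadata(df, "RDatasets.jl")
        return describe_origin(package_name, dataset_name)
    end
    throw(ArgumentError("data frame has no \"RDatasets.jl\" metadata"))
end

# Usage: tag a data frame by hand, then look its origin up.
tagged = DataFrame(x = 1:3)
metadata!(tagged, "RDatasets.jl", ("datasets", "iris"))
origin_text = describe_origin(tagged)
```

The signature lines stay free of comments, while the restriction and the failure mode live in the docstring body where help output shows them.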

nalimilan · 2025-01-26T21:58:16Z

src/dataset.jl

+        error("Unable to locate dataset file $rdaname or $csvname")
+    end
+    # Finally, inject metadata into the dataframe to indicate origin:
+    DataFrames.metadata!(dataset, "RDatasets.jl", (string(package_name), string(dataset_name)))


The `DataFrames.` qualifier is not needed AFAICT:

Suggested change

-    DataFrames.metadata!(dataset, "RDatasets.jl", (string(package_name), string(dataset_name)))
+    metadata!(dataset, "RDatasets.jl", (string(package_name), string(dataset_name)))
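
To illustrate the unqualified call, here is a small sketch of writing and reading back the origin metadata. `demo_dataset` is a hypothetical loader that builds a tiny table in memory; the real `dataset` reads files from disk.

```julia
using DataFrames

# Hypothetical loader used only for illustration.
function demo_dataset(package_name, dataset_name)
    df = DataFrame(SepalLength = [5.1, 4.9], Species = ["setosa", "setosa"])
    # `using DataFrames` already brings `metadata!` into scope,
    # so no `DataFrames.` prefix is required.
    metadata!(df, "RDatasets.jl", (string(package_name), string(dataset_name)))
    return df
end

iris = demo_dataset("datasets", :iris)
origin = metadata(iris, "RDatasets.jl")  # tuple of package and dataset name
keys_present = collect(metadatakeys(iris))
```

Converting both names with `string` means callers can pass symbols or strings and still get a uniform tuple back.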

nalimilan · 2025-01-26T21:59:34Z

src/dataset.jl

+The main purpose of its existence is to provide a way to display the content
+differently in HTML and markdown contexts.
+
+Invoked through [`RDatasets.description`](@ref).


Suggested change

-Invoked through [`RDatasets.description`](@ref).
+Obtained through [`RDatasets.description`](@ref).

nalimilan · 2025-01-26T22:00:53Z

src/dataset.jl

+    if "RDatasets.jl" in DataFrames.metadatakeys(df)
+        package_name, dataset_name = DataFrames.metadata(df, "RDatasets.jl")


Suggested change

-    if "RDatasets.jl" in DataFrames.metadatakeys(df)
-        package_name, dataset_name = DataFrames.metadata(df, "RDatasets.jl")
+    if "RDatasets.jl" in metadatakeys(df)
+        package_name, dataset_name = metadata(df, "RDatasets.jl")

nalimilan · 2025-01-26T22:02:18Z

Project.toml

@@ -1,12 +1,13 @@
 name = "RDatasets"
 uuid = "ce6b1742-4840-55fa-b093-852dadbb1d8b"
-version = "0.7.7"
+version = "0.8.0"


Probably a good occasion to tag 1.0.0. Clearly the package is stable enough.

Suggested change

-version = "0.8.0"
+version = "1.0.0"

nalimilan · 2025-01-26T22:05:14Z

README.md

+RDatasets.description(iris) # only use this on DataFrames returned from `dataset`!
+```


Suggested change

-RDatasets.description(iris) # only use this on DataFrames returned from `dataset`!
-```
+RDatasets.description(iris)
+```
+Only use the latter on data frames returned from `dataset`.

CompatHelper: bump compat for RData to 1, (keep existing compat)

8835c68

asinghvi17 and others added 6 commits April 2, 2024 08:44

Show descriptions of data sets. (#146)

56d065c

Co-authored-by: jbrea <jbrea@users.noreply.github.com>

Add a regex-based HTML to Markdown rewriter for docs

4bdf2a2

This allows them to be displayed in a much better way in the REPL.

More and better docs for description

1513803

Inject metadata into all DataFrames indicating origin from RDatasets

ec63f2a

Document description in the README

85dae81

asinghvi17 requested a review from bkamins April 2, 2024 13:46

asinghvi17 added 2 commits April 2, 2024 09:47

Make description a bit more robust

9e59d00

Bump VERSION

05f2748

asinghvi17 marked this pull request as ready for review April 2, 2024 13:52

bkamins reviewed Apr 4, 2024

Update Project.toml

30ad0b0

asinghvi17 requested a review from nalimilan April 5, 2024 12:02

asinghvi17 mentioned this pull request Jan 10, 2025

CompatHelper: bump compat for RData to 1, (keep existing compat) #142

Open

nalimilan reviewed Jan 26, 2025
