Update tv_grab_fr so it works again #247

Open: wants to merge 4 commits into base: master

fgouget (Contributor) commented Oct 31, 2024

This pull request contains the following changes:

  • tv_grab_fr stopped working a few months ago. The reason is that whereas the full TV listings used to be available on both the telestar.fr and telepoche.fr websites, now the telestar.fr website has only a limited subset of the TV listings and in a different format from that expected by tv_grab_fr. The full TV listings are now only available on the telepoche.fr website.
  • This pull request also improves handling of gaps between programmes: when there are multiple episodes of the same series back to back the source often only has the first episode start time and duration! That can result in pretty large gaps so it's best to ignore them, at least during daytime.
  • Finally this adds a note to warn users that Télé Poche does not have programme data between midnight and 5 am (i.e. programmes that start during that period).
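
The gap handling described in the second bullet can be sketched roughly as follows. This is only an illustration in Python, not code taken from tv_grab_fr: the function names, the tuple layout, and the choice to "ignore" a gap by extending the earlier programme's stop time to the next start are all assumptions, and the night window simply reuses the midnight to 5 am period mentioned above.

```python
from datetime import datetime, timedelta

# Assumption: "night" is the midnight to 5 am window mentioned in the description.
NIGHT_END_HOUR = 5


def gap_is_at_night(gap_start, gap_end):
    """Return True if the gap starts between midnight and 5 am or crosses midnight."""
    return gap_start.hour < NIGHT_END_HOUR or gap_end.date() > gap_start.date()


def close_daytime_gaps(programmes):
    """Compute stop times, absorbing daytime gaps into the preceding programme.

    programmes: list of (title, start, duration) tuples sorted by start time
    (a hypothetical layout chosen for this sketch).
    Returns a list of (title, start, stop) tuples.
    """
    result = []
    for i, (title, start, duration) in enumerate(programmes):
        stop = start + duration
        if i + 1 < len(programmes):
            next_start = programmes[i + 1][1]
            # A daytime gap is probably a run of unlisted back-to-back episodes,
            # so stretch this entry up to the next listed programme.
            if next_start > stop and not gap_is_at_night(stop, next_start):
                stop = next_start
        result.append((title, start, stop))
    return result
```

With this rule, a series listed at 14:00 for 45 minutes and followed by a programme at 16:00 would be reported as running until 16:00, while a gap that opens after midnight is left alone because there is probably no programme data for it.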

With these changes I can again get TV listings in MythTV on Debian 12.

Télé Star and Télé Poche have the same operator and very similar
content but nowadays only the Télé Poche website has the full TV
listings.

When there are multiple episodes of the same series back to back the
source often only has the first episode start time and duration! So
ignore even large gaps except in the middle of the night where it's more
likely that there is no programme.

Télé Poche (and Télé Star) usually don't have the programmes that run
between midnight and 5 am.

fgouget (Contributor Author) commented Nov 2, 2024

The 4 CI failures are all because the test containers are too old or have problems and thus are not related to this pull request.

honir (Contributor) commented Nov 11, 2024

It appears the Télépoche terms of use prevent downloading data from their website.

We respect publishers' copyright rules, so this change is not something we can officially endorse.

fgouget (Contributor Author) commented Nov 18, 2024

I checked the telepoche.fr CGUs and I guess the only relevant part is section 6.1, paragraph 4. Is that what you mean?

However the grabber currently uses data from telestar.fr which has the same CGU as telepoche (down to the PDF MD5 checksum). So doesn't that mean that the grabber should be removed entirely?

  1. It is totally broken in its current state.
  2. If you are right about the CGUs it cannot be fixed while still using this source.
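
For anyone who wants to repeat the checksum comparison mentioned above, a minimal sketch follows. It assumes both CGU PDFs have already been downloaded locally; the function names are made up for this example and the paths passed in are up to the caller.

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_document(path_a, path_b):
    """True if both files have the same MD5 digest (i.e. are almost certainly identical)."""
    return md5_of_file(path_a) == md5_of_file(path_b)
```

Calling `same_document()` on the two downloaded files returns `True` only when their contents are byte-for-byte identical, which is a stronger statement than the two documents merely reading the same.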

honir (Contributor) commented Nov 19, 2024

You are probably right. I expect it could be removed in the next release.

I'm afraid the days of screen-scraping data from TV guide websites are pretty much at an end. Even where the Ts&Cs allow copying, the webpage is often so complex that scraping is unreliable (Télépoche is an exception as it still uses an old-school design). A few websites provide JSON or XML data and they are to be applauded for their public service :-)

honir added the broken label Nov 19, 2024

garybuhrmaster (Contributor) commented

> I checked the telepoche.fr CGUs and I guess the only relevant part is section 6.1, paragraph 4. Is that what you mean?
>
> However the grabber currently uses data from telestar.fr which has the same CGU as telepoche (down to the PDF MD5 checksum). So doesn't that mean that the grabber should be removed entirely?
>
>   1. It is totally broken in its current state.
>   2. If you are right about the CGUs it cannot be fixed while still using this source.
I wonder whether, between the time of the various initial implementations and today, the terms and conditions on the use of the data have changed. I don't think anyone goes back and checks for such changes over time, except when the grabber fails in some way and a review of the new site and sources is performed. Protecting one's intellectual property (the results of the collection and curation of the guide data) is now a common expectation of those organizations (especially after various reorganizations, mergers and acquisitions), and updated T&Cs are the result.
