March 12, 2023 - Dave Bullock / eecue


March 12, 2023 8 items

Blog Posts

Reverse Engineering Read Later Data from the Apple News App

As we navigate the digital world, we often come across articles we don't have time to read but still want to save for later. One way to accomplish this is by using the Read Later feature in Apple News. But what if you want to access those articles outside the Apple News app, such as on a different device, or share them with someone who doesn't use Apple News? Or what if you want to automatically post links to those articles on your blog? That's where the nerd powers come in.

[![](https://eecue.com/img/3840/1cdda7e8fa.jpg)](https://eecue.com/photo/1cdda7e8fa)

## Reverse Engineering the Data

Initially, I reached out to [Rhet Turnbull](https://github.com/RhetTbull), the creator of the amazing [osxphotos](https://github.com/RhetTbull/osxphotos) app/Python library that I use to [extract the data from Apple Photos](https://eecue.com/blog/extracting-data-from-apple-photos--the-power-of-organization). I use that data to [power the photo section](https://eecue.com/indexes/tags-s) of my site. I asked Rhet if he had ever pulled this data from News.

While I waited to hear back from him, I used `lsof` to look for the file that Apple News uses to store Read Later articles. I discovered that Apple News uses a binary plist file located in a super obvious place:

> /Users/eecue/Library/Containers/com.apple.news/Data/Library/Application Support/com.apple.news/com.apple.news.public-com.apple.news.private-production/reading-list

Simple and obvious, right?! After I found it, I noticed it was in a strange format that a normal binary plist parser couldn't understand. However, I was able to just run `strings` on the file and extract the Apple News article ID, which forms a URL like this: [https://apple.news/AbtWOAgVqToW62MeeZ1xkcQ](https://apple.news/AbtWOAgVqToW62MeeZ1xkcQ). I wrote a script that builds that URL for each ID and then uses [Beautiful Soup](https://www.crummy.com/software/BeautifulSoup/) to extract the article data from the page.
It wasn't perfect, but it did the job:

```python
import subprocess
import requests
from bs4 import BeautifulSoup

# Run the `strings` command to extract the strings from the binary file
proc = subprocess.Popen(['strings', '/Users/eecue/Library/Containers/com.apple.news/Data/Library/Application Support/com.apple.news/com.apple.news.public-com.apple.news.private-production/reading-list'], stdout=subprocess.PIPE)

# Loop through the output and look for article IDs
article_ids = []
for line in proc.stdout:
    # Only lines that start with "rl-" carry an article ID
    if line.startswith(b'rl-'):
        # Extract the article ID by removing the "rl-" prefix and any trailing "_"
        article_id = line.decode().strip()[3:]
        if article_id.endswith('_'):
            article_id = article_id[:-1]
        article_ids.append(article_id)

def extract_info_from_apple_news(news_id):
    # Construct the Apple News URL from the ID
    apple_news_url = f'https://apple.news/{news_id}'
```
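
For anyone who would rather not shell out to `strings`, the same ID extraction can be sketched in pure Python. This is a minimal sketch, not the script from the post: the helper names `printable_strings` and `extract_article_ids` are hypothetical, and the regular expression only approximates what `strings` prints (runs of four or more printable ASCII characters). It assumes, as the script above does, that each ID sits in a run starting with `rl-` and may carry a trailing `_`.

```python
import re
from pathlib import Path

# Runs of 4+ printable ASCII bytes, an approximation of the default `strings` output.
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{4,}")


def printable_strings(data: bytes) -> list[str]:
    """Return the printable ASCII runs found in a blob of bytes."""
    return [match.group().decode("ascii") for match in PRINTABLE_RUN.finditer(data)]


def extract_article_ids(data: bytes) -> list[str]:
    """Collect IDs from runs that start with "rl-", dropping a trailing "_"."""
    article_ids = []
    for text in printable_strings(data):
        text = text.strip()
        if text.startswith("rl-"):
            article_id = text[3:]
            if article_id.endswith("_"):
                article_id = article_id[:-1]
            article_ids.append(article_id)
    return article_ids


if __name__ == "__main__":
    # Path taken from the post, rebuilt relative to the current user's home directory.
    reading_list = Path.home() / (
        "Library/Containers/com.apple.news/Data/Library/Application Support/"
        "com.apple.news/com.apple.news.public-com.apple.news.private-production/"
        "reading-list"
    )
    if reading_list.exists():
        for article_id in extract_article_ids(reading_list.read_bytes()):
            print(f"https://apple.news/{article_id}")
```

Reading the file as bytes avoids spawning a subprocess and keeps the parsing testable against small hand-made byte strings, at the cost of not matching `strings` exactly in every edge case.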

Photos (7)

Man and dog share a happy moment on the couch
