Help me get images from the Google News RSS feed

I’m using the Google News RSS Feed to access news, but it lacks media content.

I attempted to extract images from the item link using the formula:
$wg(wg(gv(source), rss, 0, link), url, ".jpg",0)$
Unfortunately, it didn’t work.

Now, I’m trying to obtain images directly from the source by extracting the jpg link from the HTML head’s <meta property="og:image">. However, I’m stuck and unsure how to proceed. I need assistance, please.

If you know of a better method to retrieve the image or have suggestions, please share.

Note: I use Automate (Tasker Alternative App).


Right now you cannot extract meta properties using wg unless you use wg XML rather than wg URL, since the “URL” mode does not parse meta tags at the moment. I need to add a new “meta” mode that will extract the proper meta resource from the head. I will try to add this in 3.75, since it’s not too difficult.

Next version will allow you to do this:

$wg(wg(gv(source), rss, 0, link), jsoup, "meta[property=og:image]", content)$

This will:

  • Get the first RSS link and download it
  • Parse the web page using jsoup
  • Use a jsoup selector to find a <meta> tag whose property is set to og:image, then select its content value
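
For anyone who wants to check what that selector should return for a given page outside Kustom, here is a minimal Python sketch of the same idea. It is not how Kustom or jsoup works internally. The names OgImageParser, extract_og_image and SAMPLE_HTML are made up for illustration, and the sample page is invented.

```python
from html.parser import HTMLParser


class OgImageParser(HTMLParser):
    """Remembers the content of the first <meta property="og:image"> tag."""

    def __init__(self):
        super().__init__()
        self.og_image = None

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; HTMLParser lowercases the names.
        if tag != "meta" or self.og_image is not None:
            return
        attr_map = dict(attrs)
        if attr_map.get("property") == "og:image":
            self.og_image = attr_map.get("content")


def extract_og_image(html):
    """Return the og:image URL found in the HTML, or None if there is none."""
    parser = OgImageParser()
    parser.feed(html)
    return parser.og_image


# Hypothetical page used only to exercise the function.
SAMPLE_HTML = """
<html><head>
<meta property="og:title" content="Example article">
<meta property="og:image" content="https://example.com/cover.jpg">
</head><body></body></html>
"""

print(extract_og_image(SAMPLE_HTML))
```

If this prints None for the real article HTML, the page most likely has no og:image tag in what was downloaded, which would also leave the Kustom formula empty.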

Beta will be out soon.
Thanks for your idea!


Would it be able to follow redirects? I ask because Google News redirects from its own page to the destination page. For example:

The RSS link for the first article is:

https://news.google.com/rss/articles/CBMiYWh0dHBzOi8vd3d3LmNubi5jb20vMjAyNC8wMy8wNy9wb2xpdGljcy90YWtlYXdheXMtam9lLWJpZGVuLXN0YXRlLW9mLXRoZS11bmlvbi1hZGRyZXNzL2luZGV4Lmh0bWzSAWVodHRwczovL2FtcC5jbm4uY29tL2Nubi8yMDI0LzAzLzA3L3BvbGl0aWNzL3Rha2Vhd2F5cy1qb2UtYmlkZW4tc3RhdGUtb2YtdGhlLXVuaW9uLWFkZHJlc3MvaW5kZXguaHRtbA?oc=5

Then it redirects to:

https://edition.cnn.com/2024/03/07/politics/takeaways-joe-biden-state-of-the-union-address/index.html

Redirects should work. You can test it right now; it’s available via manual download at

I’ve tried it, but it failed to redirect to the destination site.

You can test it by using the Google News US source.

The problem is not that Kustom fails to follow redirects. Google News uses JavaScript to redirect when Kustom fetches the page with its default user agent. One fix would be to force a different user agent (like curl) globally, but this would break things. To fix this, please use a Flow instead:

  • Add a WebGet action and give it this header: User-Agent: curl/7.1.2
  • Add a formula action like $wg(#last, jsoup, …)$
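
To compare what the server returns for different user agents outside Kustom, a rough Python sketch like the one below could help. The helper names build_request and fetch are hypothetical. Whether Google News answers a curl user agent with a plain redirect is what this thread suggests, not something the code guarantees.

```python
import urllib.request


def build_request(url, user_agent="curl/7.1.2"):
    """Build a GET request carrying the same User-Agent header as the WebGet action."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url, user_agent="curl/7.1.2", timeout=10):
    """Download the URL and return the body as text.

    urllib follows ordinary HTTP redirects on its own; it cannot follow
    redirects that are performed by JavaScript inside the page.
    """
    request = build_request(url, user_agent)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Calling fetch(article_url) once with the default and once with a browser-like user agent, then comparing the two bodies, should show whether the page you get back is the article or a JavaScript redirect stub.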

I created a flow like this, but when I test it in the flow, it keeps loading while executing the formula:

  1. Web Get:
    1.1 HTTP URL:
    https://news.google.com/rss?hl=en&gl=US&ceid=US:en
    1.2 HEADERS:
    User-Agent: curl/7.1.2

  2. Formula:
    $wg(wg(#last, rss, 0, link), jsoup, "meta[property=og:image]", content)$

  3. Set Global Var

Is my flow wrong?

This is my current flow:

It works, although not always in preview (I am investigating this). It currently sets the title of the article, but you can change the property. What I do is:

  • First, fetch the RSS with a WebGet action
  • Then fetch the article with another WebGet action using $wg(#last, rss, 0, link)$ (here, and only here, the user agent is important)
  • Finally, parse the metadata using $wg(#last, jsoup, "meta[property=og:title]", content)$ to get the title
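
As a rough illustration of what the middle step does when it picks the link of the first RSS item, here is a small Python sketch. The function name rss_item_link and the sample feed are invented for this example; it only mirrors the idea of $wg(#last, rss, 0, link)$ and is not Kustom's implementation.

```python
import xml.etree.ElementTree as ET


def rss_item_link(rss_xml, index=0):
    """Return the <link> of the RSS item at the given index, or None if absent."""
    root = ET.fromstring(rss_xml)
    items = root.findall("./channel/item")
    if not 0 <= index < len(items):
        return None
    return items[index].findtext("link")


# Hypothetical feed with two items, used only to exercise the function.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <title>First article</title>
      <link>https://example.com/articles/1</link>
    </item>
    <item>
      <title>Second article</title>
      <link>https://example.com/articles/2</link>
    </item>
  </channel>
</rss>"""

print(rss_item_link(SAMPLE_RSS))
```

The returned link is what the second WebGet action would download, and the metadata selector then runs on that downloaded page.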

You can just copy the snippet below, then go to the “flows” tab and press the paste icon.

##KUSTOMCLIP##
{
  "clip_version": 1,
  "KUSTOM_FLOW": {
    "FInTgLTd": {
      "id": "FInTgLTd",
      "name": "Flow000",
      "t": [
        {
          "id": "TTJ9ZtW8",
          "type": "T_MANUAL"
        }
      ],
      "a": [
        {
          "id": "TAfeeO1Q",
          "type": "A_WGET",
          "params": {
            "headers": "User-Agent: curl/7.1.2",
            "uri": "https://news.google.com/rss?hl\u003den\u0026gl\u003dUS\u0026ceid\u003dUS:en"
          }
        },
        {
          "id": "TATnB0rq",
          "type": "A_WGET",
          "params": {
            "headers": "User-Agent: curl/7.1.2",
            "uri": "$wg(#last, rss, 0, link)$"
          }
        },
        {
          "id": "TAgGePFM",
          "type": "A_FORMULA",
          "params": {
            "formula": "$wg(#last, jsoup, \"meta[property\u003dog:title]\", content)$"
          }
        },
        {
          "id": "TAjZcRAS",
          "type": "A_GLOBAL",
          "params": {
            "global": "title"
          }
        }
      ]
    }
  }
}
##KUSTOMCLIP##

I’ve tried this just now, but I’m getting this error:

I tried outside preview too, but nothing changed in the dedicated global text.

I am trying to understand whether there is an issue here; the results are not consistent.

This has been fixed in the next release; the flow is now much more consistent in downloading data when parsing.
