GSA (Google Search Appliance) is Google's enterprise search product. Recently I had to work with
GSA as the search solution on the IBM WebSphere Portal platform. The main use cases are crawling the WCM seedlists
(including binary documents and WCM content) and the portal content.
Check out below for more details regarding the GSA basics.
There are two simple ways we can feed the portal/WCM content to GSA:
- Writing a proxy component
  - Get the portal/WCM seedlists from the IBM system (using the IBM out-of-the-box seedlist framework)
  - Parse the IBM out-of-the-box seedlist content
  - Generate the GSA-compatible seedlist
  - Post the GSA-compatible seedlist to the GSA server
- Generating the GSA-compatible feed directly
  - Write a custom component using the IBM portal/WCM API
  - Generate the feed in a GSA-supported format
  - Post the feed to GSA
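
Both approaches end the same way: build a feed XML and post it to the appliance. Below is a minimal sketch of that flow in Python, not the actual component I wrote. It assumes the IBM seedlist is Atom-based and that each entry carries its URL in a link element; the real seedlist has extra IBM-specific elements that are ignored here. The function names, the host name and the datasource name are all hypothetical. The feed layout (gsafeed, header, group, record) and the post to port 19900 at /xmlfeed follow the GSA feeds protocol as I understand it, so please verify them against the GSA documentation for your version.

```python
import uuid
import urllib.request
import xml.etree.ElementTree as ET

ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Prolog the GSA feed DTD expects in front of the <gsafeed> element.
FEED_PROLOG = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<!DOCTYPE gsafeed PUBLIC "-//Google//DTD GSA Feeds//EN" "">\n'
)


def seedlist_urls(atom_xml):
    """Collect the link URLs from an Atom-style seedlist (assumed layout)."""
    root = ET.fromstring(atom_xml)
    urls = []
    for entry in root.findall("atom:entry", ATOM_NS):
        link = entry.find("atom:link", ATOM_NS)
        # Skip entries that have no usable link.
        if link is not None and link.get("href"):
            urls.append(link.get("href"))
    return urls


def build_gsa_feed(datasource, urls, feedtype="metadata-and-url"):
    """Build a GSA feed that asks the appliance to crawl each URL."""
    gsafeed = ET.Element("gsafeed")
    header = ET.SubElement(gsafeed, "header")
    ET.SubElement(header, "datasource").text = datasource
    ET.SubElement(header, "feedtype").text = feedtype
    group = ET.SubElement(gsafeed, "group")
    for url in urls:
        # mimetype is fixed to text/html here only to keep the sketch short.
        ET.SubElement(group, "record", {"url": url, "mimetype": "text/html"})
    return FEED_PROLOG + ET.tostring(gsafeed, encoding="unicode")


def post_feed(gsa_host, datasource, feed_xml, feedtype="metadata-and-url"):
    """Post the feed as multipart/form-data to the GSA feedergate."""
    boundary = uuid.uuid4().hex
    fields = {"feedtype": feedtype, "datasource": datasource, "data": feed_xml}
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'
        )
    parts.append(f"--{boundary}--\r\n")
    request = urllib.request.Request(
        f"http://{gsa_host}:19900/xmlfeed",
        data="".join(parts).encode("utf-8"),
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

For the proxy approach you would fetch the seedlist from the portal, pass it through seedlist_urls and build_gsa_feed, and then call post_feed with your appliance host. For the direct approach you would skip the seedlist step and collect the URLs through the portal/WCM API instead.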
Thank you for your interesting article, Siva. Sadly, they discontinued the GSA. Here is a petition to extend support for the GSA: https://www.change.org/p/google-inc-extend-support-for-the-google-search-appliance-gsa-don-t-kill-it-in-2018