GSA (Google Search Appliance) Metadata Search


  1. HTML <meta> tags are included as metadata
<html>
<head>
<meta name="generator" content="my testing">
<meta name="category" content="private">
<meta name="zipcode" content="10001">
<meta name="publishdate" content="03.10.2013">
<title>News on 2013-03-10</title>
</head>
<body> …. 
         Date: 10 Mar 2013
</body>
</html>
  2. You can search within the metadata alone, as follows:
    1. Meta tag existence ::: inmeta:generator (returns all documents that have a generator meta tag)
    2. Partial value search ::: inmeta:generator~my (returns all documents whose generator meta tag value partially matches "my")
    3. Exact value search ::: inmeta:category=private (returns only documents whose category meta tag is exactly "private")
    4. Range search ::: inmeta:zipcode:10000..10005 (returns documents whose zipcode is between 10000 and 10005)
    5. Date range search ::: daterange:    (returns documents whose date falls within the given range)
    6. Logical AND ::: there is no "AND" query term; by default, only documents that contain ALL query terms are returned
    7. Logical OR ::: inmeta:category=private OR inmeta:zipcode:10001..10005
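The query forms listed above can be composed programmatically. The following is a minimal Python sketch, not part of any GSA client library; the helper names (meta_exists, meta_partial, meta_exact, meta_range, all_of, any_of) are hypothetical and only concatenate strings in the syntax shown in the list.

```python
def meta_exists(name: str) -> str:
    """Match documents that carry the given meta tag."""
    return f"inmeta:{name}"


def meta_partial(name: str, value: str) -> str:
    """Match documents whose meta tag value partially matches `value`."""
    return f"inmeta:{name}~{value}"


def meta_exact(name: str, value: str) -> str:
    """Match documents whose meta tag value is exactly `value`."""
    return f"inmeta:{name}={value}"


def meta_range(name: str, low: int, high: int) -> str:
    """Match documents whose meta tag value lies between `low` and `high`."""
    return f"inmeta:{name}:{low}..{high}"


def all_of(*terms: str) -> str:
    """Logical AND: terms separated by spaces are all required by default."""
    return " ".join(terms)


def any_of(*terms: str) -> str:
    """Logical OR: terms joined with the OR keyword."""
    return " OR ".join(terms)


query = any_of(meta_exact("category", "private"),
               meta_range("zipcode", 10001, 10005))
print(query)  # inmeta:category=private OR inmeta:zipcode:10001..10005
```

The resulting string is what a user would type into the search box; it still needs URL encoding before it is sent in a request.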


inmeta Syntax | Search Parameter Syntax | Description
inmeta:[meta tag name] | &requiredfields=[meta tag name] | Returns results that contain the specified meta tag.
inmeta:[meta tag name]~[meta tag content] | &partialfields=[meta tag name]:[meta tag content] | Returns results that have the specified meta tag with a value that matches some or all of the specified meta tag content.
inmeta:[meta tag name]=[meta tag content] | &requiredfields=[meta tag name]:[meta tag content] | Returns only results that match the exact meta tag content value specified.
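The correspondence between the inmeta operator and the search parameters can be expressed as a small translation function. This is an illustrative Python sketch only: inmeta_to_param is a hypothetical helper, it covers just the three forms in the mapping above (range queries are rejected), and it does not URL-encode the values.

```python
def inmeta_to_param(term: str) -> tuple[str, str]:
    """Translate one inmeta term into a (parameter name, parameter value) pair."""
    prefix = "inmeta:"
    if not term.startswith(prefix):
        raise ValueError(f"not an inmeta term: {term!r}")
    body = term[len(prefix):].strip()

    # Scan for the first operator character after the meta tag name.
    for i, ch in enumerate(body):
        if ch == "~":  # partial match -> partialfields=name:content
            return ("partialfields", f"{body[:i]}:{body[i + 1:]}")
        if ch == "=":  # exact match -> requiredfields=name:content
            return ("requiredfields", f"{body[:i]}:{body[i + 1:]}")
        if ch == ":":  # range form, not covered by the mapping
            raise ValueError(f"range terms are not handled: {term!r}")

    # No operator at all: a plain existence check.
    return ("requiredfields", body)


print(inmeta_to_param("inmeta:generator"))         # ('requiredfields', 'generator')
print(inmeta_to_param("inmeta:generator~my"))      # ('partialfields', 'generator:my')
print(inmeta_to_param("inmeta:category=private"))  # ('requiredfields', 'category:private')
```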



  1. By default, GSA also performs a partial lookup in the metadata associated with documents.
  2. Limits on metadata
    1. Number of meta tags :: no limit
    2. Total bytes for all tags :: 300 KB (There is no direct limit on the maximum number of bytes of metadata returned with each search result. However, meta tags and snippets beyond the first 300 KB of the document are not displayed or returned.)
    3. Total bytes per tag :: 1500
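A pre-indexing check against the per-tag limit might look like the sketch below. It assumes the 1500-byte figure applies to the UTF-8-encoded content value of each tag, which the limits above do not state explicitly; oversized_tags and MAX_BYTES_PER_TAG are hypothetical names.

```python
MAX_BYTES_PER_TAG = 1500  # per-tag limit quoted above


def oversized_tags(meta: dict[str, str], limit: int = MAX_BYTES_PER_TAG) -> list[str]:
    """Return the names of meta tags whose content exceeds `limit` bytes.

    Assumption: size is measured on the UTF-8 encoding of the content value.
    """
    return [name for name, content in meta.items()
            if len(content.encode("utf-8")) > limit]


meta = {
    "generator": "my testing",
    "category": "private",
    "notes": "x" * 1501,  # one byte over the limit
}
print(oversized_tags(meta))  # ['notes']
```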

NOTE: GSA has long supported metadata search through the requiredfields and partialfields parameters of the search API protocol. The inmeta operator lets search users issue partialfields- and requiredfields-style searches directly from the search box (in the q= parameter).
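To make the note concrete, here is a rough Python sketch of the two request styles. The host gsa.example.com, the search term "news", and the search_url helper are assumptions for illustration; a real request would normally carry additional parameters that are omitted here.

```python
from urllib.parse import urlencode

BASE_URL = "http://gsa.example.com/search"  # hypothetical appliance host


def search_url(**params: str) -> str:
    """Build a search URL with URL-encoded query parameters."""
    return f"{BASE_URL}?{urlencode(params)}"


# Style 1: the inmeta operator typed into the search box (q= parameter).
from_search_box = search_url(q="news inmeta:category=private")

# Style 2: the same restriction expressed as a separate search parameter.
from_parameter = search_url(q="news", requiredfields="category:private")

print(from_search_box)
print(from_parameter)
```

Both URLs express the same restriction (category must be exactly "private"); only the place where it is written differs.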
