Drwhite

Reputation: 1695

How to index a feed in Google Search Appliance?

I have my Atom feed (a list of continents as XML) at the URL .../continent/search?view=atom, like this:

<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/">
    <title>List of all continents</title>
    <opensearch:totalResults>{{ continents_length }}</opensearch:totalResults>
    <opensearch:startIndex>{{ continents.start_index }}</opensearch:startIndex>
    <opensearch:itemsPerPage>{{ count }}</opensearch:itemsPerPage>
    <opensearch:Query continent="request" searchTerms="" startPage="{{ continents.start_index }}" />
    <author><name>My_site</name></author>
    <id>urn:domain-id:mysite.com:continent</id>

    <link rel="self" href="{{ url }}" />
    {% for continent in continents %}
    <entry>
        <span class="continent_id">{{ continent.continent_id }}</span>
        <span class="continent_name">{{ continent.continent_name }}</span>
        <span class="list_countries">{{ continent.list_countries }}</span>
    </entry>
    {% endfor %}
</feed>

To PUT and index my feed in the GSA interface, I use this:

<?xml version="1.0" encoding="utf-8"?>
<entry xmlns="http://www.w3.org/2005/Atom">
    <title>continent</title>
    <id>urn:domain-id:mysite.com:continent</id>
    <author>
        <name>admin user</name>
    </author>
    <link rel="self" href=".../feed/continent"/>
    <content type="xhtml">
        <div xmlns="http://www.w3.org/1999/xhtml">
            <span id="refresh-each">15 12,14,18 * * *</span>
            <span id="gsa-datasource">continent</span>
            <span id="gsa-feedtype">full</span>
            <span id="url">...continent/search?view=atom</span>
            <span id="opensearch-pattern">&amp;count=100&amp;startPage=%STARTPAGE%</span>
            <ul class="connection">
                <li id="userid">user</li>
                <li id="password">pass</li>
            </ul>
            <ul id="metadata">
                <li id="continent_id">atom:entry/xhtml:span[@class='continent_id']</li>
                <li id="continent_name">atom:entry/xhtml:span[@class='continent_name']</li>
                <li id="list_countries">atom:entry/xhtml:span[@class='list_countries']</li>
            </ul>
            <div id="xsl-content">
                <![CDATA[
                    <xsl:stylesheet version="1.0"
                        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                        xmlns:atom="http://www.w3.org/2005/Atom"
                        xmlns:xhtml="http://www.w3.org/1999/xhtml"
                        exclude-result-prefixes="atom xhtml">

                        <xsl:template name="FormatDescription">
                            <xsl:param name="name"/>
                            <xsl:value-of select="$name"/>
                        </xsl:template>

                        <xsl:template match="atom:entry">
                            <html>
                                <body>
                                    <xsl:apply-templates select="atom:entry/xhtml:span" />
                                </body>
                            </html>
                        </xsl:template>
                        <xsl:template match="atom:entry/xhtml:span">
                            <xsl:copy-of select="*"/>
                        </xsl:template>

                    </xsl:stylesheet>
                ]]>
            </div>
        </div>
    </content>
</entry>

But when I check the flow of transferred files, it reports 0 files, with this error:

ProcessNode: Missing required attribute url. skipping element., skipping record

On the second and third indexing runs there is no error, but no file either:

200 OK Feed continent has been pushed successfully to the Google Search Appliance.

Any suggestions or recommendations?

Upvotes: 0

Views: 1078

Answers (2)

avirr

Reputation: 668

Following up on this, the word "feed" is confusing here. A GSA content feed is not like an RSS or Atom feed. Here's a simplified example (from the documentation):

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE gsafeed PUBLIC "-//Google//DTD GSA Feeds//EN" "">
<gsafeed>
   <header>
     <datasource>hello</datasource>
     <feedtype>incremental</feedtype>
   </header>
   <group>
     <record url="http://www.corp.enterprise.com/hello02" mimetype="text/plain">
       <content>UPDATED - This is hello02</content>
     </record>
   </group>
</gsafeed>

As you can see, that's a very specific XML format, not shared with web site update feed formats. The documentation for this is good: http://www.google.com/support/enterprise/static/gsa/docs/admin/70/gsa_doc_set/feedsguide/feedsguide.html
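
To make the difference concrete, here is a minimal sketch in Python of building that kind of feed from the continent data instead of serving Atom. It is only an illustration under assumptions: the function name `build_gsafeed`, the `base_url` value, and the choice to put the continent name in `<content>` are all hypothetical, and the `<metadata>`/`<meta>` elements follow the record layout described in the feeds guide. The point it shows is that each `<record>` carries its own `url` attribute, which is what the appliance reports as missing.

```python
import xml.etree.ElementTree as ET

# DOCTYPE line copied from the documentation example above.
GSA_DOCTYPE = '<!DOCTYPE gsafeed PUBLIC "-//Google//DTD GSA Feeds//EN" "">'


def build_gsafeed(continents, datasource="continent", feedtype="full",
                  base_url="http://mysite.example/continent/"):
    """Build a GSA content feed (as a string) from a list of continent dicts.

    base_url is a hypothetical placeholder: every record needs a url
    attribute, so each continent must map to some URL of your own.
    """
    feed = ET.Element("gsafeed")
    header = ET.SubElement(feed, "header")
    ET.SubElement(header, "datasource").text = datasource
    ET.SubElement(header, "feedtype").text = feedtype

    group = ET.SubElement(feed, "group")
    for continent in continents:
        record = ET.SubElement(group, "record", {
            "url": base_url + str(continent["continent_id"]),
            "mimetype": "text/plain",
        })
        # Metadata goes before content inside a record.
        metadata = ET.SubElement(record, "metadata")
        for key in ("continent_id", "continent_name", "list_countries"):
            ET.SubElement(metadata, "meta",
                          {"name": key, "content": str(continent[key])})
        ET.SubElement(record, "content").text = str(continent["continent_name"])

    body = ET.tostring(feed, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + GSA_DOCTYPE + "\n" + body


if __name__ == "__main__":
    sample = [{"continent_id": 1, "continent_name": "Africa",
               "list_countries": "Kenya, Ghana"}]
    print(build_gsafeed(sample))
```

The resulting string is what would be pushed to the appliance in place of the Atom document; how it is pushed (and which feedtype suits your data) is covered in the feeds guide.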

Upvotes: 2

BigMikeW

Reputation: 831

Check out the feed documentation

You need to pass the GSA a feed XML file in its own format, not an Atom feed.

Upvotes: 1
