Reputation: 1782
I have imported some pages from a website into my Plone site. The problem is that the import changes the URLs, so Plone can no longer locate the pages. Before the import, one of the URLs looks like this:
http://wiki.scandiatransplant.org/?What_Is_Scandiatransplant
And after the import, it looks like this:
http://localhost:8080/Scandiatransplant/wiki/index.html?What_Is_Scandiatransplant
Obviously this is a problem, since there is no page called index.html. Is there a way to solve this? My guess is that it could be fixed by adding a step to the pipeline.cfg file that tells funnelweb not to change the URL, but I haven't created a pipeline.cfg yet. The page http://plone.org/products/funnelweb/#using-a-local-pipeline-configuration explains that one can create a pipeline.cfg file, but it doesn't say where to put it. Where should I place this file?
Finally, funnelweb lets you specify with regular expressions which files to ignore during the import, but I have not told it to ignore any files. Still, it doesn't import the images, PDF files, XSLT files, etc. Has anyone else experienced this?
So, to summarize my questions:
Where should I place the pipeline.cfg file?
How do I stop Plone/funnelweb from changing the URLs, so that the imported pages keep their original URLs?
How do I make funnelweb import the images and pdf files as well?
Upvotes: 2
Views: 90
Reputation: 1123420
You can put the pipeline.cfg file anywhere; you tell funnelweb where to find it on the command line:
bin/funnelweb --pipeline=path/to/your/pipeline.cfg
Keeping the original URLs is more complex. The site you are importing from is a wiki, and the page names are part of the query string there. The ?What_Is_Scandiatransplant part needs to be used as the id of the new Plone page, and URLs used in other pages need to be rewritten to match.
You can certainly do that in the pipeline, but it is a little more complex than can easily be written up here. Follow the documentation for funnelweb (the urltidy component will help rewrite URLs), and feel free to ask specific questions about problems you encounter here on SO.
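To make the idea concrete, here is a minimal Python sketch of the mapping described above: take the bare query string of a wiki URL as the page id and build the matching URL under the Plone folder. This is not funnelweb's API and not a pipeline step; the helper names page_id_from_wiki_url and rewrite_link are hypothetical, and the sketch assumes the wiki always puts the page name in the query string without any key=value pairs.

```python
from urllib.parse import urlsplit


def page_id_from_wiki_url(url):
    """Return the page name carried in the query string, or None.

    Assumption: the wiki uses the whole query string as the page name,
    e.g. "?What_Is_Scandiatransplant", with no key=value parameters.
    """
    query = urlsplit(url).query
    if not query or "=" in query:
        return None
    return query


def rewrite_link(url, plone_base):
    """Map a wiki-style URL to a URL below plone_base.

    URLs that do not carry a page name are returned unchanged.
    """
    page_id = page_id_from_wiki_url(url)
    if page_id is None:
        return url
    return plone_base.rstrip("/") + "/" + page_id


if __name__ == "__main__":
    original = "http://wiki.scandiatransplant.org/?What_Is_Scandiatransplant"
    print(page_id_from_wiki_url(original))
    print(rewrite_link(original, "http://localhost:8080/Scandiatransplant/wiki"))
```

In a real import the same logic would have to live inside the pipeline, so that both the page ids and the links in other pages are rewritten consistently.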
For the missing images and PDF files, check the logs to see what funnelweb already finds and uploads. You may have to adjust the webcrawler settings; this varies from site to site. Without more details about the site I can only give you this general hint.
Upvotes: 3