Drupal Feed Importer Failures

Modern web site software, like Drupal, simply provides a web-based front end to a database. For a traditional web site, the front end retrieves articles and formats them on HTML pages. A more sophisticated web site allows richer access to the database.

If we embrace this vision, we build a new web site by importing databases. On Drupal, that means we use the Feed Importer. I have finally gotten the Feed Importer to work, after several hours of banging around.

Bug Fixing

The modern bug fixing process seems to be as follows:

  1. Try a few things. If they succeed, exit.
  2. Look up the error on Google, and use the different descriptions of the symptoms to find different explanations of what might be going wrong.
  3. Use the explanations to develop a few ideas of new things to try.
  4. Go to Step 1 and continue.

I continued this process until, thanks to Google, I stumbled on the explanation.

To quote the reaction of a colleague with the same problem: "Oh dear lord."

The problem is that Drupal text processing does not recognize Macintosh end-of-line markings (a bare carriage return instead of a line feed) by default. You have to go into your site's "settings.php" (the source of all site-specific configuration) and tell it to detect a possibly-surprising line end marking:

ini_set('auto_detect_line_endings', 1);
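
Before changing "settings.php", it can help to confirm that line endings really are the culprit. The following is a minimal Python sketch, run on the import file outside Drupal, that counts each line-ending style and optionally rewrites everything as plain line feeds. The function names are hypothetical, and the approach assumes the file is small enough to read into memory in one piece.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CRLF (Windows), bare LF (Unix), and bare CR (classic Mac) endings."""
    crlf = data.count(b"\r\n")
    return {
        "crlf": crlf,
        "lf": data.count(b"\n") - crlf,  # line feeds not preceded by CR
        "cr": data.count(b"\r") - crlf,  # carriage returns not followed by LF
    }


def normalize_line_endings(data: bytes) -> bytes:
    """Rewrite CRLF and bare CR endings as LF."""
    # CRLF must be replaced first so it does not turn into two line feeds.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


# Illustrative sample: a tiny CSV saved with classic Macintosh line endings.
sample = b"id,title\r1,First\r2,Second\r"
print(count_line_endings(sample))
print(normalize_line_endings(sample))
```

A nonzero "cr" count suggests the file uses the Macintosh convention described above. Normalizing the file is an alternative to the "settings.php" change, not a replacement for understanding it; note that it also rewrites any line breaks embedded inside quoted fields.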

So, let me make a list of Feed Importer Pitfalls and Gotchas.

Feed Importer CSV Node Importing Checklist

This checklist is for people who import nodes from CSV or similarly-formatted files. I'll let someone else explain the subtleties of importing feeds.

  • Enable "auto detect line endings" in your "settings.php" file, especially if any of your import files are produced on a Macintosh. Otherwise the importer won't import anything. This is explained above.
  • The safest way to start is to "clone" the example "Node Importer" and customize it to work with the data you're importing.
  • I generally create a custom content type when I import data. I try to assign some reasonable fields to the node's "title" and "body" elements. I'll omit the "body" if there's no legitimate field that applies to it.
  • Under the Importer's Basic Settings, the oddly-named "Attach Content Type" should be set to "Use stand-alone form." This lets you select a file to import.
  • Select "Import on Submission" so it imports the file immediately. I imported 117KB of tab-delimited data without relying on cron.
  • Under the Node Processor Settings, be sure to set "Content Type" to the content type you're importing. If you created a custom content type to contain your data fields, select that type here (not under "Attach Content Type").
  • Under the Node Processor Mappings, enter the column headers from the input data, and map each one to a field in the content type you're importing.
  • At least one column should contain a unique field, preferably numerical. If you don't have one, simply add a column that individually numbers your rows of input data. You can call this the "GID" field. You don't need to add a separate, unique field in the content type - you simply map it to the existing "GUID" field.
  • Both the "CSV Import" and the "TAB Import" fail on certain special characters, especially single quote marks and extended ASCII characters. Look over the text strings in the CSV file before you import it, and add double quote marks around strings that contain such things. 
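
The last two checklist items (adding a numbered column and quoting troublesome strings) can be handled in one pass before the file ever reaches Drupal. Here is a rough Python sketch, assuming the file has a header row and fits in memory. The function name "prepare_csv" and its parameters are made up for illustration, and quoting every field is a blunt simplification of "quote the strings that need it."

```python
import csv
import io


def prepare_csv(text: str, guid_header: str = "GID", delimiter: str = ",") -> str:
    """Prefix each data row with a sequential number and quote every field."""
    # newline="" leaves line endings untranslated so the csv module can
    # handle them itself, including line breaks inside quoted fields.
    rows = list(csv.reader(io.StringIO(text, newline=""), delimiter=delimiter))
    out = io.StringIO(newline="")
    writer = csv.writer(
        out, delimiter=delimiter, quoting=csv.QUOTE_ALL, lineterminator="\n"
    )
    if rows:
        header, *records = rows
        writer.writerow([guid_header, *header])
        for number, record in enumerate(records, start=1):
            writer.writerow([number, *record])
    return out.getvalue()


# Illustrative input containing a single quote mark and a non-ASCII character.
raw = "title,body\nO'Brien's,caf\u00e9\n"
print(prepare_csv(raw))
```

For tab-delimited input, pass delimiter="\t". The new first column can then be mapped to "GUID" under the Node Processor Mappings as described in the checklist.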

No doubt there are more pitfalls that I've missed. I should probably post this on the Drupal home site.
