Sarvap Praharanayuthan

Reputation: 4360

Which is the advised method of storing huge content?

I am planning to start an article-based website where users will type their articles and upload images.

Now I am a bit confused about how I should save the data: in a database, on the file system as .txt or .html files, or by some other means. Saving the data in a database worries me a little because I initially plan to run the site on a shared server. Will the shared server's capacity be enough for this much content? Or is it advisable to save the content as separate .txt or .html files?

Considerations:

  1. Search will be provided only for article titles, which will be saved in the database. Search will not extend to the content of the articles.
  2. I plan to use a WYSIWYG editor and let contributors format their content, so the stored data will obviously contain HTML. Is it true that storing the content in the file system is safer because it avoids an XSS attack on the database?
  3. Images will be stored in the file system, not in the database.

a. What points should I focus on to prevent XSS attacks while doing this?

b. If storing in the database is the advised solution, what should the data type be: TEXT or LONGTEXT?

Upvotes: 1

Views: 109

Answers (2)

AlexV

Reputation: 23098

These are the two most common solutions I can think of:

  1. Store everything in the database.
  2. Store the "small" data in the database and all attachments (binary files such as JPEG and PDF) outside of the database on the filesystem.

Both solutions have advantages and drawbacks.

Solution #1: Store everything in the database

Advantages:

  • With some (powerful) databases you can even index (search) the content of common file formats such as PDF (Oracle interMedia is an example).
  • You can easily ensure data integrity.
  • You can easily ensure data security.

Drawbacks:

  • Makes the database huge, and it can become painfully slow if you never do maintenance on your database/tables.
  • Can be harder to "browse" binary content for debugging.
  • You especially need to run database/table maintenance if your project has a huge database with many users reading from and writing to it.
  • Database backups may be harder to make and restore.
  • Serving files in web applications can sometimes be tricky (you need to know the MIME type to serve files correctly).
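
As a rough illustration of the MIME type point, one option is to record the type next to the binary data when it is saved, so the web layer can set Content-Type later. This is only a sketch: it uses Python's built-in sqlite3 and mimetypes modules as stand-ins for whatever database and language the site really uses, and the attachment table and both function names are hypothetical.

```python
import mimetypes
import sqlite3


def store_attachment(conn, filename, data):
    """Save the bytes together with a MIME type guessed from the file name."""
    mime, _ = mimetypes.guess_type(filename)
    mime = mime or "application/octet-stream"  # generic fallback for unknown types
    cur = conn.execute(
        "INSERT INTO attachment (filename, mime_type, data) VALUES (?, ?, ?)",
        (filename, mime, data),
    )
    conn.commit()
    return cur.lastrowid


def load_attachment(conn, attachment_id):
    """Return (mime_type, data), or None if the id is unknown."""
    return conn.execute(
        "SELECT mime_type, data FROM attachment WHERE id = ?", (attachment_id,)
    ).fetchone()


# Hypothetical schema, kept in memory for the example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE attachment ("
    "id INTEGER PRIMARY KEY, filename TEXT, mime_type TEXT, data BLOB)"
)
att_id = store_attachment(conn, "photo.jpg", b"\xff\xd8\xff")
mime_type, data = load_attachment(conn, att_id)
unknown_id = store_attachment(conn, "noext", b"x")
unknown_mime, _ = load_attachment(conn, unknown_id)
```

Guessing from the file name is the simplest approach; it trusts the uploader's extension, so a real site would probably want to verify the content as well.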

Solution #2: Store the "small" data in the database and all attachments outside of the database on the filesystem

Advantages:

  • HTTP caching of the files is somewhat easier to do.
  • Easier to browse the files (for debugging or anything else).
  • Easier to maintain a speedy system without doing anything special.

Drawbacks:

  • Need to create and maintain a relationship table that links the files on the filesystem to the entities in the database.
  • Data integrity can't really be guaranteed (what happens if a file is deleted manually on the filesystem but is still referenced in the database?).
  • Security must be ensured on many levels.
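
The relationship table and the integrity drawback could look roughly like the sketch below. It assumes Python with the built-in sqlite3 module in place of the real database; the article and article_file tables, the <id>.html naming scheme and both function names are hypothetical and only meant to show the idea.

```python
import os
import sqlite3
import tempfile


def save_article(conn, storage_dir, title, html):
    """Insert the title row, write the body to <id>.html, then link the two."""
    cur = conn.execute("INSERT INTO article (title) VALUES (?)", (title,))
    article_id = cur.lastrowid
    path = os.path.join(storage_dir, f"{article_id}.html")
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(html)
    conn.execute(
        "INSERT INTO article_file (article_id, path) VALUES (?, ?)",
        (article_id, path),
    )
    conn.commit()
    return article_id


def find_missing_files(conn):
    """Return the ids of articles whose linked file is gone from the disk."""
    rows = conn.execute(
        "SELECT article_id, path FROM article_file ORDER BY article_id"
    ).fetchall()
    return [article_id for article_id, path in rows if not os.path.isfile(path)]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("CREATE TABLE article_file (article_id INTEGER, path TEXT)")
storage_dir = tempfile.mkdtemp()

first = save_article(conn, storage_dir, "First", "<p>one</p>")
second = save_article(conn, storage_dir, "Second", "<p>two</p>")
os.remove(os.path.join(storage_dir, f"{second}.html"))  # simulate a manual deletion
missing = find_missing_files(conn)
```

A periodic check like find_missing_files does not prevent the inconsistency, it only detects it, which is the weakness described in the drawback above.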

This is a quick overview of what I can think of. Both solutions can work well; it really depends on how many users will use your project and what hardware is available to you.

For a shared environment I would probably go with #2, since shared environments are usually not very powerful.

Upvotes: 2

Iqbal Malik

Reputation: 602

I recently faced the same issue. I have millions of profiles, and every profile itself contains a huge amount of data. Storing huge data in a relational database is not recommended because it slows down site performance. I recommend this solution:

  1. Store in the relational database only the data that is needed for searching and for the initial page load, e.g. ArticleTitle and tags.

  2. Use a NoSQL database (CouchDB) to hold all the information about an article. When saving documents in CouchDB, use the article id as the document name so that you can easily map article ids to article documents.
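
To make the id mapping concrete, here is a minimal sketch of the request that would store an article under its own id. CouchDB creates a named document with an HTTP PUT to /{db}/{docid}. The server address, the database name "articles", the document fields and the function name are all assumptions made for illustration. The request is only built, not sent, since sending it needs a running server and usually credentials.

```python
import json
import urllib.parse
import urllib.request

COUCH_URL = "http://localhost:5984"  # assumption: a local CouchDB on its default port
DB_NAME = "articles"                 # assumption: hypothetical database name


def build_put_request(article_id, title, body_html):
    """Build a PUT request that stores the article under its own id."""
    doc_id = urllib.parse.quote(str(article_id), safe="")  # CouchDB ids are strings
    url = f"{COUCH_URL}/{DB_NAME}/{doc_id}"
    payload = json.dumps({"title": title, "body": body_html}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


req = build_put_request(42, "Hello", "<p>Hi</p>")
# To actually store the document (requires a reachable CouchDB server):
# urllib.request.urlopen(req)
```

Because the document id equals the article id, the relational row and the CouchDB document can be matched without an extra lookup table.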

Upvotes: 1
