Posts Tagged SQL
There are quite a few ways that source code and other preformatted text can get mangled in WordPress, such as a rich-text editor, or even by the export/import process.
When I speak of ‘mangled’, I mean unnecessarily HTML-encoded. Left and right angle brackets (the < and > signs), quotation marks (the ” symbol), and ampersands (the & symbol) get HTML-encoded to <, &rt;, ", and &. This can seriously mangle sourcecode, as well as making other text seriously ugly.
It can become a huge time waste to try to go back and edit all your affected posts one at a time to fix mangled code, even with search-and-replace.
Instead, there’s an easy and straight-forward way to do it in SQL, that you can execute from phpMyAdmin, MySQL Workbench, or whatever you fancy.
Simply run each of the following queries on your WordPress database, and feel free to edit them as you like.
update wp_posts set post_content = replace(post_content, '"', '"'); update wp_posts set post_content = replace(post_content, '>', '>'); update wp_posts set post_content = replace(post_content, '<', '<');
Have any others to suggest? Please feel free to do so in the comments below. Thanks!
If you have a lot of WordPress posts and might want to find all posts containing a certain keyword for any reason, you can start by using the following SQL code, which was taken from this post. I used this in phpMyAdmin for a MySQL database. Make sure you are in the correct database first!
You can substitute any keyword for ‘needle’ below, but you must have the single-quotes and percent signs around it.
SELECT ID FROM wp_posts WHERE post_content LIKE '%needle%';
Example: Let’s say you use the NG Gallery plugin, which has you add a tag to all your posts to include said gallery. Now you find to find all posts which have that NG Gallery tag in them. The following query would work:
SELECT ID FROM wp_posts WHERE post_content LIKE '%nggallery%';
This can also be built upon for find-and-replace operations.
Keep in mind, you can really muck things up using SQL. Make a backup first if you don’t know what you’re doing!
Questions, comments, and feedback are always welcome. Thanks!
In my quest for the perfect “in my dreams” backup solution for my Ubuntu VPS, I created this very simple script which can be run as a cron job and can be easily modified to backup any amount of data to any remote FTP or SFTP server.
You could very easily include a database backup by running mysqldump beforehand, but I’m not including it in this script.
This required yafc to be installed, but Ubuntu installations can easily install it by running
sudo apt-get install yafc
And now, for the script:
#!/bin/bash # format of the open command is proto://username:password@HOSTorIP/ # proto is either ftp or ssh # special characters in the username or password are not well tolerated # anything in the EOF tags are direct commands to yafc. Test if unsure DIR=`date +%F` yafc <<EOF open ftp://username:email@example.com/ cd backup-dir mkdir $DIR cd $DIR put -p -r * close exit EOF
Enjoy! Questions, comments, and feedback are welcome and appreciated. Thank you!
In part 1, I told you how to set up JungleDisk backup for your Linux server. In this part 2, I’ll tell you how to automatically have it dump and backup your MySQL databases (correctly)!
There are security implications if this permissions are not set correctly on the files, as you’re storing your MySQL password in the script, and you’re going to have complete database dumps sitting on your hard drive. However, I will attempt to explain as clearly as possible my solution, and I’m not responsible if it doesn’t work for you.
So, start out by deciding where you want your database-dumping script to run from. A few good example spots are /root (root users home directory), and /etc/jungledisk (with the configuration files). I’ve called my backup script prebackup.sh. You’ll understand the name further below, but I’m going to use that name the rest of the way through.
So create your prebackup.sh script as root, and set it to mode 700.
touch prebackup.sh && chmod 0700 prebackup.sh
This makes sure that root is the only user who can read, write, or execute it.
Now, using your favorite text editor, you can use the following sample script:
#!/bin/bash date=`date -I` dir=/var/backups file=sqlbackup-$date.sql touch $dir/$file && chmod 600 $dir/$file && mysqldump --all-databases -p'__MySQLPassword__' >> $dir/$file find $dir -ctime +7 -delete
Danny pointed out that creating gzipped MySQL dumps, as I had in my original script, is discouraged as it defeats the data de-dup feature of jungledisk. The above script has been changed from the original to make uncompressed dumps. Thanks, Danny!
The last line, with the find statement, is responsible for the cleanup of the directory. It will delete any file which is older than 7 days. If you want more or less time, simply change
+7 to your desired number of days (keeping the
Warning: There’s very little (read: none) sanity-checking in this script. Replace
__MySQLPassword__ with your root MySQL password.
Jay pointed out that there will likely be issues with handling special characters in the SQL password. If you have any suggestions, please feel free to post them in the comments below.
After saving, you should have a 183-byte (give or take) file with 0700 mode:
root@ve:/etc/jungledisk# ls -l prebackup.sh
-rwx------ 1 root root 183 Mar 24 17:08 prebackup.sh
You should make sure that the directory in $dir is owned by user and group root and mode 0700. This will make sure that noone else has access to your dumped databases. Now that you have your script, it’s time to automate it. You could schedule a cron job to run the script, but it’s easier to have JungleDisk run it for you at the start of every backup job.
Start your management-side program, login, and go to your server’s backup configuration. Under Backup Options, check Enable Pre and Post Backup Scripts. Now, click Configure, and in the Pre-Backup Script, enter the full path and filename of your newly created script, i.e. /etc/jungledisk/prebackup.sh.
Now, on your assigned schedule, the server engine will run your database dumping script, and start your backup job. Of course, make sure whatever directory you’re dumping your databases to is included in your backup set.
The last thing to note is this script is amazingly light; it doesn’t delete any old databases and it doesn’t do much else. You’re free to modify it, and I would greatly appreciate any feedback on your modifications.
Comments are welcome, as always.
This guide assumes you’re using a Debian-based (Ubuntu, Debian) build of Linux, and we’ll be using the 64-bit download from JungleDisk. The instructions don’t really change for the 32-bit version, except for the installer file name.
The first steps:
Go over to the JungleDisk Business page and sign up for, and download, the server edition. Now, this edition comes in two very important parts: The server-side program, and the management-side program.
The server-side program is what runs on your server. That’s the “backup engine” if you will. You will download the program appropriate for your server environment.
The management-side program is the program that you remotely connect to the “server” to configure it. You will download the program appropriate for running on your desktop computer.
It is fine to download the server-side version for Linux and the management-side version for Windows, if that is your configuration. In this case, I’m downloading the Linux .deb server-side installer for 64-bit linux, and the Windows management-side program. I can’t give you the actual download links; you’ll find them in your account page.
Now, on the server, I’m going to install the server-side engine as root. Navigate to the directory where you placed the downloaded file, and run:
sudo dpkg -i junglediskserver_315-0_amd64.deb
That will install the server. Follow the directions. Now, since it’s the deb package, it will automatically set up init scripts to make sure the jungledisk engine runs on startup. However, you will notice that at the end of the setup you were prompted to copy and edit an xml file. Copy /usr/local/share/jungledisk/junglediskserver-license-EXAMPLE.xml to /etc/jungledisk/junglediskserver-settings.xml
cp /usr/local/share/jungledisk/junglediskserver-license-EXAMPLE.xml /etc/jungledisk/junglediskserver-settings.xml
Now use your favorite editor to make a few changes to /etc/jungledisk/junglediskserver-settings.xml:
, enter your license key (found in your JungleDisk account).
Now, restart the jungledisk service.
Myself, I also added my login user name and password between
respectively, but I don’t think it’s necessary. Worth keeping in mind if something doesn’t work right.
That’s all for the server-side configuration.
Now, on your management-side program, simply run the program and log into your JungleDisk account and your server should appear in the list. Double-click on it and the configuration screen will appear, where you can create backup sets and schedule them as you wish.
Want to correctly backup your MySQL databases in your backup set? See part 2 of this article, coming soon!
Comments are welcome, as always!
If you’re using Piwik Analytics, you may know that over time your Piwik database will continue to grow over time. The piwik log tables as well as the archive tables will continue to grow until you purge them.
The Piwik developers are aware of this and a ticket has been in their system for a while now to develop a way to automatically purge the Piwik logs. The FAQ describes a SQL query you can run to purge the Piwik logs manually, but you may want a way to do this automatically. Fortunately, here’s the solution, with a few caveats.
UPDATE: As of Piwik 1.5, the following method is depreciated. Piwik now has log purging implemented. See Piwik FAQ #42.
First, before proceeding, check the FAQ link for the correct SQL query to run. The SQL query has changed at least once that I’m aware of. The below guide reflects the SQL query that was posted as of the day this article was posted.
Second, this method uses a script which will contain a SQL password, so it does have some security implications. You can avoid these by making sure the script has the correct permissions. This guide will assume that your SQL user’s name is ‘piwik’ and you are running the script as unix user ‘www-data’. Please adjust for your individual configuration.
Third, and lastly, it’s suggested to run this query monthly. You will want to make sure your logs have been archived for the proceeding month. In most normal installations this is the case. The only time this would not be, is if you are archiving using a cron job and your cron job has not been running.
So here goes:
Find a place where your cron user will have access to the SQL file. A suggested place is the misc/cron directory within your piwik installation, right next to the archive.sh file. Let’s call this file purge.sql. It’s not necessary for it to not be accessible from the web, as it won’t have any sensitive information in it — just the SQL query. Create this file and paste the following SQL query into it:
DELETE piwik_log_visit, piwik_log_link_visit_action FROM piwik_log_visit INNER JOIN piwik_log_link_visit_action WHERE piwik_log_visit.idvisit = piwik_log_link_visit_action.idvisit AND visit_first_action_time <= DATE_SUB(CURDATE(), INTERVAL 30 DAY); OPTIMIZE TABLE piwik_log_visit, piwik_log_link_visit_action;
Create the script which will call the SQL query. This script will contain your SQL password, so it’s a good idea to make sure it is mode 0700 and outside a web-accessible directory. We will call this purge.sh. Assuming your database is called piwik, your SQL username is piwik, and your password is piwik123, create the file with this query:
mysql piwik -upiwik -p'piwik123' < /var/www/piwik/misc/cron/purge.sql
For your reference, the mysql command sequence is:
mysql -u -p < mysql.sql
Time to schedule the cron job. You can either edit the crontab yourself from the terminal (
crontab -e ), or use your favorite web-GUI scheduler such as cPanel, Webmin, etc.
For the terminal users, you’ll want the crontab line to read:
1 0 1 * * sh /path/to/purge.sh
This will schedule the purge.sh script to run at 12:01am on the first day of the month.
For the web-GUI users, refer to your GUI’s documentation.
Once this is done, just relax. Auto-purging will take place as scheduled.
Questions or comments about this? See anything that could be improved? Have a better way of doing this? Your comments are welcome, as always.
So after my little fiasco with plug-ins and CPU throttling, I’ve been looking for ways to make WordPress at least a little lighter and faster. I’m not going to cover disabling plug-ins, I’m going to go over a few other ways, starting with …
Every time a post is edited and/or published, a new revision is created. These stick around in the database (never deleted) and can not only grow the database, but can also lengthen query times for information. So, per WordPress codex, here’s the quick-and-dirty:and the
…simply add the following line of code to wp-config.php file located in the root or home directory of WordPress blog.
If you would rather limit the number of revisions to something small, say 2 for example, just use the number instead of
It should be added somewhere before the
require_once(ABSPATH . 'wp-settings.php'); line. That’s it. Revisions will no longer be created. If you want to delete previously created revisions, read on…
So now that you’ve disabled revisions, how do you delete all the old cruft laying around?has the answer on this one too.
…and then issue the following [SQL] command (it’s also recommended to backup the database before performing the deletion SQL statements):
DELETE FROM wp_posts WHERE post_type = "revision";
All revisions should now be deleted from the database.
Caching is a hot button for sites that could potentially see high amounts of traffic (and since we would all like to be in that category…) The caching plug-in that I use and recommend is WP Super Cache. The UI is easy enough to work around, though it does require editing of the .htaccess file.
Shared hosting providers get real upset when applications and scripts perform excessive and unoptimized database queries. Heavy themes, excessive numbers of widgets, and badly-written plug-ins all contribute to this. Fortunately, a post on CravingTech points to an easy method to check the number of queries happening on a single page load.
You can insert this snippet of code on your Footer.php file (or anywhere) to make it display the number of queries:
<?php echo $wpdb->num_queries; ?> <?php _e(‘queries’); ?>
After looking at the number of queries occurring on a page load, try changing themes, disabling plug-ins, and/or reducing the number of widgets on a page to reduce the number of queries. SQL Monitor looks like a useful plug-in for further examining SQL queries, but I haven’t used it, so I can’t comment on it’s usefulness (or lack thereof).
I’ve stumbled on some additional information while researching, and apparently the “WordPress should correct invalidly nested XHTML automatically” setting (under Settings > Writing) can not only increase the load when a post is saved, but can also break some plug-ins. If you’re familiar enough with (X)HTML to handle correctly closing tags, you might actually be better turning this off.
You can also find other settings for wp-config.php on the WordPress Codex page.
CMS systems like WordPress, Drupal, Joomla, etc are rife with plug-ins and modules you can add for extra functionality, but it’s really hard to tell the load that some of those add-ons place on your system and database. When you want to have a website that won’t collapse under load (or take the server down with it) it’s important to have well-written plug-ins. Many times we install things assuming without knowing the impact they can have. Especially in the case of shared hosting where everyone shares a server, it’s important to play nice — or face the possibility of having your hosting account suspended because you put too heavy of a load on the system.
I got to experience both sides of this first hand. Some time back I had my Drupal site hosted on my own machine. In searching for a decent chat room, I found the Drupal chat room module. It was easy to install and set up, and I had a few friends joining in to give me feedback on it. After 3 or 4 people joined, it started to get really laggy. I took a look at the server and saw that it was literally drilling a hole in the SQL server. The amount of queries and load that it was producing was just unbelievable. So I disabled it. Lesson learned.
Last night while I was making some changes here, I noticed a lot of issues with pages taking forever to load, lost connections to the MySQL server, etc. Since I’m on shared hosting, the first thing I thought of was “maybe the server is having an issue?”
So I checked server status: Nothing.
I let it go and kept making edits to my pages, but noticed the problem was getting worse instead of better. I checked server status again; nothing. I thought: There has to be a problem somewhere. So I looked through my cPanel and found an icon that I thought might give me a clue: CPU Throttling.
What you see below on the left side of the red line is what I saw (I took the screenshot 15 hours after I started working the issue so I would have a clear before-and-after).
BlueHost uses a unique CPU throttling approach, not primarily to control CPU/RAM usage, but to control scripts which pound on the SQL database and make time-consuming queries. If the database gets pounded too hard, it becomes a major issue for everyone on the server. So they throttle access to the database for load spikes, and that keeps everyone happy. They say that a throttle of 500 sec/hour is acceptable, and you shouldn’t see any slowdown from it. Obviously my problem was way beyond that.
I was obviously having an issue with some script, and it needed to be fixed immediately. Since they make log files available to you for “Slow SQL queries”, I took a read through them. I saw here and there 1.3, 1.5, 1.8 seconds… not terrible. Then I saw the issues. I had SQL queries that were running 3, 4, 5, 6, even up to 8 seconds each. You know what? It wasn’t even this site. It was the feeds module I had added to another site.
So I weighed the benefit I was seeing to the site (which was zero) and went ahead and started purging the data and disabled the modules. This took a few hours (because of the already throttled connection state), and when I was done I let it idle for an hour and made sure I was no worse off than before I had started. It was nearly 3am, so off to bed I went.
I woke up this morning and immediately checked the CPU throttling chart. Not only was I under the 500sec/hr target, I was less than half of it.
The worse part of this is that I was almost never aware of the issue. It wasn’t until I stated making bulk edits that I noticed there was a problem.
This does happen to be a system that only BlueHost offers. My only request might be that I could have gotten alerted via email when the load spike shot up so I could have been aware of the issue rather than having to find it out myself. But in any case, I saw it, I took care of it, and all is well.
Have an experience with a script that negatively impacted your CMS or server platform? Please share it below…