Category Archives: Databases

Alter mysql tables on the fly, without locking them

If you need to perform real-time ALTER TABLE processes on MySQL (InnoDB, TokuDB) tables, a great tool for the job is the Percona Toolkit.

The Percona Toolkit includes a utility (pt-online-schema-change) to perform such a process, without write-locking tables, or having to manually create a temporary table and triggers to synchronize the data during the process.

[codesyntax lang="bash"]

time pt-online-schema-change --host my-awesome-database.example.net --user=user --password=secret --execute --print --no-drop-old-table --alter "DROP INDEX kenny" D=my-database,t=my-table

[/codesyntax]

[codesyntax lang="bash"]

time pt-online-schema-change --hostmy-awesome-database.example.net --user=user --password=secret --execute --print --no-drop-old-table --alter "add source_id int(10) unsigned DEFAULT NULL"D=my-database,t=my-table

[/codesyntax]

MariaDB 10.0: How to migrate from InnoDB to TokuDB

This is only to remember what I did to migrate from InnoDB to TokuDB on a MariaDB cluster.
TokuDB settings are specific to my environment, so please don't ask me what/how/why.

root@dbrfoo:~# mysql -u root -p
MariaDB [(none)]> INSTALL SONAME 'ha_tokudb';
Query OK, 0 rows affected (0.07 sec)

MariaDB [(none)]> show engines;
+--------------------+---------+----------------------------------------------------------------------------+--------------+------+------------+
| Engine             | Support | Comment                                                                    | Transactions | XA   | Savepoints |
+--------------------+---------+----------------------------------------------------------------------------+--------------+------+------------+
| MEMORY             | YES     | Hash based, stored in memory, useful for temporary tables                  | NO           | NO   | NO         |
| MRG_MyISAM         | YES     | Collection of identical MyISAM tables                                      | NO           | NO   | NO         |
| MyISAM             | YES     | MyISAM storage engine                                                      | NO           | NO   | NO         |
| BLACKHOLE          | YES     | /dev/null storage engine (anything you write to it disappears)             | NO           | NO   | NO         |
| CSV                | YES     | CSV storage engine                                                         | NO           | NO   | NO         |
| TokuDB             | YES     | Tokutek TokuDB Storage Engine with Fractal Tree(tm) Technology             | YES          | YES  | YES        |
| PERFORMANCE_SCHEMA | YES     | Performance Schema                                                         | NO           | NO   | NO         |
| ARCHIVE            | YES     | Archive storage engine                                                     | NO           | NO   | NO         |
| InnoDB             | DEFAULT | Percona-XtraDB, Supports transactions, row-level locking, and foreign keys | YES          | YES  | YES        |
| FEDERATED          | YES     | FederatedX pluggable storage engine                                        | YES          | NO   | YES        |
| Aria               | YES     | Crash-safe tables with MyISAM heritage                                     | NO           | NO   | NO         |
+--------------------+---------+----------------------------------------------------------------------------+--------------+------+------------+
11 rows in set (0.00 sec)

MariaDB [(none)]> show plugins;
+-------------------------------+----------+--------------------+--------------+---------+
| Name                          | Status   | Type               | Library      | License |
+-------------------------------+----------+--------------------+--------------+---------+
| binlog                        | ACTIVE   | STORAGE ENGINE     | NULL         | GPL     |
| mysql_native_password         | ACTIVE   | AUTHENTICATION     | NULL         | GPL     |
| mysql_old_password            | ACTIVE   | AUTHENTICATION     | NULL         | GPL     |
| MEMORY                        | ACTIVE   | STORAGE ENGINE     | NULL         | GPL     |
| MyISAM                        | ACTIVE   | STORAGE ENGINE     | NULL         | GPL     |
| CSV                           | ACTIVE   | STORAGE ENGINE     | NULL         | GPL     |
| MRG_MyISAM                    | ACTIVE   | STORAGE ENGINE     | NULL         | GPL     |
| PERFORMANCE_SCHEMA            | ACTIVE   | STORAGE ENGINE     | NULL         | GPL     |
| InnoDB                        | ACTIVE   | STORAGE ENGINE     | NULL         | GPL     |
| XTRADB_READ_VIEW              | ACTIVE   | INFORMATION SCHEMA | NULL         | GPL     |
| XTRADB_INTERNAL_HASH_TABLES   | ACTIVE   | INFORMATION SCHEMA | NULL         | GPL     |
| XTRADB_RSEG                   | ACTIVE   | INFORMATION SCHEMA | NULL         | GPL     |
| INNODB_TRX                    | ACTIVE   | INFORMATION SCHEMA | NULL         | GPL     |
| INNODB_LOCKS                  | ACTIVE   | INFORMATION SCHEMA | NULL         | GPL     |
| INNODB_LOCK_WAITS             | ACTIVE   | INFORMATION SCHEMA | NULL         | GPL     |
| INNODB_CMP                    | ACTIVE   | INFORMATION SCHEMA | NULL         | GPL     |
| INNODB_CMP_RESET              | ACTIVE   | INFORMATION SCHEMA | NULL         | GPL     |
| INNODB_CMPMEM                 | ACTIVE   | INFORMATION SCHEMA | NULL         | GPL     |
| INNODB_CMPMEM_RESET           | ACTIVE   | INFORMATION SCHEMA | NULL         | GPL     |
| INNODB_CMP_PER_INDEX          | ACTIVE   | INFORMATION SCHEMA | NULL         | GPL     |
| INNODB_CMP_PER_INDEX_RESET    | ACTIVE   | INFORMATION SCHEMA | NULL         | GPL     |
| INNODB_BUFFER_PAGE            | ACTIVE   | INFORMATION SCHEMA | NULL         | GPL     |
| INNODB_BUFFER_PAGE_LRU        | ACTIVE   | INFORMATION SCHEMA | NULL         | GPL     |
| INNODB_BUFFER_POOL_STATS      | ACTIVE   | INFORMATION SCHEMA | NULL         | GPL     |
| INNODB_METRICS                | ACTIVE   | INFORMATION SCHEMA | NULL         | GPL     |
| INNODB_FT_DEFAULT_STOPWORD    | ACTIVE   | INFORMATION SCHEMA | NULL         | GPL     |
| INNODB_FT_DELETED             | ACTIVE   | INFORMATION SCHEMA | NULL         | GPL     |
| INNODB_FT_BEING_DELETED       | ACTIVE   | INFORMATION SCHEMA | NULL         | GPL     |
| INNODB_FT_CONFIG              | ACTIVE   | INFORMATION SCHEMA | NULL         | GPL     |
| INNODB_FT_INDEX_CACHE         | ACTIVE   | INFORMATION SCHEMA | NULL         | GPL     |
| INNODB_FT_INDEX_TABLE         | ACTIVE   | INFORMATION SCHEMA | NULL         | GPL     |
| INNODB_SYS_TABLES             | ACTIVE   | INFORMATION SCHEMA | NULL         | GPL     |
| INNODB_SYS_TABLESTATS         | ACTIVE   | INFORMATION SCHEMA | NULL         | GPL     |
| INNODB_SYS_INDEXES            | ACTIVE   | INFORMATION SCHEMA | NULL         | GPL     |
| INNODB_SYS_COLUMNS            | ACTIVE   | INFORMATION SCHEMA | NULL         | GPL     |
| INNODB_SYS_FIELDS             | ACTIVE   | INFORMATION SCHEMA | NULL         | GPL     |
| INNODB_SYS_FOREIGN            | ACTIVE   | INFORMATION SCHEMA | NULL         | GPL     |
| INNODB_SYS_FOREIGN_COLS       | ACTIVE   | INFORMATION SCHEMA | NULL         | GPL     |
| INNODB_SYS_TABLESPACES        | ACTIVE   | INFORMATION SCHEMA | NULL         | GPL     |
| INNODB_SYS_DATAFILES          | ACTIVE   | INFORMATION SCHEMA | NULL         | GPL     |
| INNODB_CHANGED_PAGES          | ACTIVE   | INFORMATION SCHEMA | NULL         | GPL     |
| ARCHIVE                       | ACTIVE   | STORAGE ENGINE     | NULL         | GPL     |
| Aria                          | ACTIVE   | STORAGE ENGINE     | NULL         | GPL     |
| FEDERATED                     | ACTIVE   | STORAGE ENGINE     | NULL         | GPL     |
| BLACKHOLE                     | ACTIVE   | STORAGE ENGINE     | NULL         | GPL     |
| FEEDBACK                      | DISABLED | INFORMATION SCHEMA | NULL         | GPL     |
| partition                     | ACTIVE   | STORAGE ENGINE     | NULL         | GPL     |
| TokuDB                        | ACTIVE   | STORAGE ENGINE     | ha_tokudb.so | GPL     |
| TokuDB_trx                    | ACTIVE   | INFORMATION SCHEMA | ha_tokudb.so | GPL     |
| TokuDB_lock_waits             | ACTIVE   | INFORMATION SCHEMA | ha_tokudb.so | GPL     |
| TokuDB_locks                  | ACTIVE   | INFORMATION SCHEMA | ha_tokudb.so | GPL     |
| TokuDB_file_map               | ACTIVE   | INFORMATION SCHEMA | ha_tokudb.so | GPL     |
| TokuDB_fractal_tree_info      | ACTIVE   | INFORMATION SCHEMA | ha_tokudb.so | GPL     |
| TokuDB_fractal_tree_block_map | ACTIVE   | INFORMATION SCHEMA | ha_tokudb.so | GPL     |
+-------------------------------+----------+--------------------+--------------+---------+
54 rows in set (0.00 sec)

MariaDB [(none)]> \q

root@dbrfoo:~# vim /etc/mysql/conf.d/tokudb.cnf

[mariadb]
# See https://mariadb.com/kb/en/how-to-enable-tokudb-in-mariadb/
# for instructions how to enable TokuDB
#
# See https://mariadb.com/kb/en/tokudb-differences/ for differences
# between TokuDB in MariaDB and TokuDB from http://www.tokutek.com/

#plugin-load=ha_tokudb.so

tokudb_cache_size=300GB
tokudb_commit_sync=off
tokudb_fsync_log_period=1000
tokudb_directio=ON
tokudb_disable_slow_alter=ON
tokudb_disable_hot_alter=OFF
tokudb_load_save_space=ON
tokudb_row_format=tokudb_fast


root@dbrfoo:~# /etc/init.d/mysql stop
root@dbrfoo:~# /etc/init.d/mysql start


----------------
150615 14:35:57 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended
150615 14:36:09 mysqld_safe Starting mysqld daemon with databases from /data/db/mysql
150615 14:36:10 [Note] /usr/sbin/mysqld (mysqld 10.0.19-MariaDB-1~wheezy-log) starting as process 20812 ...
150615 14:36:10 [Note] InnoDB: Using mutexes to ref count buffer pool pages
150615 14:36:10 [Note] InnoDB: The InnoDB memory heap is disabled
150615 14:36:10 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
150615 14:36:10 [Note] InnoDB: Memory barrier is not used
150615 14:36:10 [Note] InnoDB: Compressed tables use zlib 1.2.7
150615 14:36:10 [Note] InnoDB: Using Linux native AIO
150615 14:36:10 [Note] InnoDB: Using CPU crc32 instructions
150615 14:36:10 [Note] InnoDB: Initializing buffer pool, size = 100.0G
150615 14:36:16 [Note] InnoDB: Completed initialization of buffer pool
150615 14:36:17 [Note] InnoDB: Highest supported file format is Barracuda.
150615 14:36:20 [Note] InnoDB: 128 rollback segment(s) are active.
150615 14:36:20 [Note] InnoDB: Waiting for purge to start
150615 14:36:20 [Note] InnoDB:  Percona XtraDB (http://www.percona.com) 5.6.23-72.1 started; log sequence number 38210935651348
150615 14:36:20 [Note] Plugin 'FEEDBACK' is disabled.
150615 14:36:20 [Note] Server socket created on IP: '::'.
150615 14:36:20 [Note] Event Scheduler: Loaded 0 events
150615 14:36:20 [Note] /usr/sbin/mysqld: ready for connections.
Version: '10.0.19-MariaDB-1~wheezy-log'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306  mariadb.org binary distribution
----------------

root@dbrfoo:~# mysql -u root -p

MariaDB [(none)]> use xxxxxxxx_ugc;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed

MariaDB [xxxxxxxx_ugc]> alter table user_data engine=tokudb;
Stage: 1 of 2 'Fetched about 161900000 rows, loading data still remains'   3.49% of stage done

Custom Elasticsearch template with custom field mapping

Disclaimer: I am writing this post mostly for me to remember how and what I did to solve wrong field mapping in elasticsearch.

Quote from Elasticsearch Dynamic Mapping documentation page

When Elasticsearch encounters a previously unknown field in a document, it uses dynamic mapping to determine the datatype for the field and automatically adds the new field to the type mapping.

Sometimes this is the desired behavior and sometimes it isn't. Perhaps you don't know what fields will be added to your documents later, but you want them to be indexed automatically. Perhaps you just want to ignore them. Or - especially if you are using Elasticsearch as a primary data store - perhaps you want unknown fields to throw an exception to alert you to the problem.

I had this annoying problem with a field that was mapped as date when it was supposed to be a string.

It is worth mentioning that existing type and field mappings cannot be updated because that would mean invalidating already indexed documents and the right way would be to create a new index with the correct mappings and reindex the data into that index.

Ok, cool, but how do I tell elasticsearch to map that particular field to string? Well, we need to create a template that will automatically be applied when new indices are created. The template could include both settings and mappings, and a simple pattern template that controls whether the template should be applied to the new index.

Let's get to work!

What is the full path of my field (stats_dates in this case)?
[codesyntax lang="bash"]

curl -s -XGET http://localhost:9200/logstash-2016.10.05/_mappings | jq 'path(recurse(if type|. == "array" or . =="object" then .[] else empty end))'

[/codesyntax]

[
...
  "logstash-2016.10.05",
  "mappings",
  "hemlock",
  "properties",
  "stats_dates",
  "type"
]
...

Ok, so the full path is mappings.hemlock.properties.stats_dates.type. Now that we the full path, let's create the template.
[codesyntax lang="bash"]

curl -s -XPUT http://localhost:9200/_template/logstash-stats_dates -d '

{
  "order": 0,
  "template": "logstash-*",
  "settings": {
  },
  "mappings": {
    "hemlock": {
      "properties": {
        "stats_dates": {
          "type": "string"
        }
      }
    }
  },
  "aliases": {
  }
}

'

[/codesyntax]

Check if the template was applied to the new index:

[codesyntax lang="bash"]

curl -s http://localhost:9200/logstash-2016.10.06/_mappings | jq '.[].mappings.hemlock.properties.stats_dates'

[/codesyntax]

{
  "type": "string"
}

Marvel indices taking lot of space? Delete indices older than 7 days!

It looks like Marvel is generating some data everyday. Is there a way to reduce the amount of data generated by marvel? The short answer to the above question is Yes!

[codesyntax lang="bash"]

curator --host 127.0.0.1 show indices --older-than 30 --time-unit days --timestring '%Y.%m.%d' --prefix .marvel
curator --host 127.0.0.1 close indices --older-than 30 --time-unit days --timestring '%Y.%m.%d' --prefix .marvel
curator --host 127.0.0.1 delete indices --older-than 30 --time-unit days --timestring '%Y.%m.%d' --prefix .marvel

[/codesyntax]

If we want to remove all indices from February 2015:

[codesyntax lang="bash"]

curator --host 127.0.0.1 show indices --regex '\.marvel-2015\.02\.*'
curator --host 127.0.0.1 close indices --regex '\.marvel-2015\.02\.*'
curator --host 127.0.0.1 delete indices --regex '\.marvel-2015\.02\.*'

[/codesyntax]

Dumping multiple mysql tables at once

Could happen to have a mysql database with a lot of tables and we want dump all of them, but we also want to make the logical dump process faster. Some of you could say that mydumper is the solution. It could be. But what if we don't have the permission to install it or whatever?

Let's say we have these tables and we need to dump all of them:

[codesyntax lang="bash"]

MariaDB [database]> show tables like "objects_2015%";
+-----------------------------------------------+
| Tables_in_xxxxxxxxxx_logs (objects_2015%)     |
+-----------------------------------------------+
| objects_2015_12_01                            |
| objects_2015_12_02                            |
| objects_2015_12_03                            |
| objects_2015_12_04                            |
| objects_2015_12_05                            |
| objects_2015_12_06                            |
| objects_2015_12_07                            |
| objects_2015_12_08                            |
| objects_2015_12_09                            |
| objects_2015_12_10                            |
| objects_2015_12_11                            |
| objects_2015_12_12                            |
| objects_2015_12_13                            |
| objects_2015_12_14                            |
| objects_2015_12_15                            |
| objects_2015_12_16                            |
| objects_2015_12_17                            |
| objects_2015_12_18                            |
| objects_2015_12_19                            |
| objects_2015_12_20                            |
| objects_2015_12_21                            |
| objects_2015_12_22                            |
| objects_2015_12_23                            |
| objects_2015_12_24                            |
| objects_2015_12_25                            |
| objects_2015_12_26                            |
| objects_2015_12_27                            |
| objects_2015_12_28                            |
| objects_2015_12_29                            |
| objects_2015_12_30                            |
| objects_2015_12_31                            |
+-----------------------------------------------+
74 rows in set (0.00 sec)

[/codesyntax]

Dumping, let's say, 6 tables in parallel will speed up the dump process.
Note: Too many mysqldump instances could overload the server and make the dump process slower (if the server is a production server this could have quite an impact on the performance).

We need the list with all the commands we need to execute:

[codesyntax lang="bash"]

for i in $(mysql -h host -uuser -ppassword database -Bsqe "show tables like \"objects_2015%\""); do echo "mysqldump -h host -uuser -ppassword --add-drop-table --quick --quote-names --disable-keys --extended-insert database $i | gzip > /path/to/backup/$i.sql.gz"; done > /tmp/things_to_do/tables.txt

[/codesyntax]

Start the dump process:
[codesyntax lang="bash"]

parallel -j 6 < /tmp/things_to_do/tables.txt

[/codesyntax]

Kill all mysql queries having query time greater than 1 minute

At this point there are two approaches to achieve this. One is using pt-kill from Percona Toolkit, and the other one is to use a bash script with a lot of pipes :)
Why would someone use the second approach? I don't know, perhaps because there is no Percona Toolkit available.

[codesyntax lang="bash"]

for i in $(mysql -e "show processlist" | egrep -v "system user" | grep '^[0-9]' | awk '{print $6" "$1}' | sort -nr -k1 | awk -v threshold="60" '$1 > threshold' | awk '{print $2}'); do mysql -e "kill $i"; done

[/codesyntax]

If you can/want to use pt-kill and don't know how, please read this.

Natural Sorting in MySQL

The data:

select id, name from Object where name like "dbrferrari%";
+-----+--------------+
| id  | name         |
+-----+--------------+
|   8 | dbrferrari5  |
|   9 | dbrferrari6  |
|  25 | dbrferrari1  |
|  26 | dbrferrari2  |
|  35 | dbrferrari3  |
|  64 | dbrferrari4  |
|  80 | dbrferrari7  |
|  99 | dbrferrari11 |
| 101 | dbrferrari8  |
| 102 | dbrferrari10 |
| 133 | dbrferrari12 |
| 134 | dbrferrari15 |
| 135 | dbrferrari14 |
| 199 | dbrferrari9  |
| 200 | dbrferrari16 |
| 202 | dbrferrari18 |
| 211 | dbrferrari13 |
+-----+--------------+

The problem:

select id, name from Object where name like "dbrferrari%" order by name asc;
+-----+--------------+
| id  | name         |
+-----+--------------+
|  25 | dbrferrari1  |
| 102 | dbrferrari10 |
|  99 | dbrferrari11 |
| 133 | dbrferrari12 |
| 211 | dbrferrari13 |
| 135 | dbrferrari14 |
| 134 | dbrferrari15 |
| 200 | dbrferrari16 |
| 202 | dbrferrari18 |
|  26 | dbrferrari2  |
|  35 | dbrferrari3  |
|  64 | dbrferrari4  |
|   8 | dbrferrari5  |
|   9 | dbrferrari6  |
|  80 | dbrferrari7  |
| 101 | dbrferrari8  |
| 199 | dbrferrari9  |
+-----+--------------+

The solution:

select id, name from Object where name like "dbrferrari%" order by LENGTH(name), name asc;
+-----+--------------+
| id  | name         |
+-----+--------------+
|  25 | dbrferrari1  |
|  26 | dbrferrari2  |
|  35 | dbrferrari3  |
|  64 | dbrferrari4  |
|   8 | dbrferrari5  |
|   9 | dbrferrari6  |
|  80 | dbrferrari7  |
| 101 | dbrferrari8  |
| 199 | dbrferrari9  |
| 102 | dbrferrari10 |
|  99 | dbrferrari11 |
| 133 | dbrferrari12 |
| 211 | dbrferrari13 |
| 135 | dbrferrari14 |
| 134 | dbrferrari15 |
| 200 | dbrferrari16 |
| 202 | dbrferrari18 |
+-----+--------------+

How to add debug symbols for MariaDB Debian/Ubuntu packages

I don't know about other distributions, but I know that the debug symbols were stripped from Debian/Ubuntu packages. If there are some crashes reported I won't be able to fully analyze them. The only way to fix this problem is to build the packages again.

[codesyntax lang="bash"]

git clone https://github.com/MariaDB/server.git
cd server/
git branch -a
git checkout 10.0

apt-get install libdistro-info-perl
apt-get install fakeroot
apt-get install libreadline-gplv2-dev libpam0g-dev dpatch libjemalloc-dev
apt-get install libboost-all-dev libjudy-dev libjudydebian1
apt-get install build-essential dpkg-dev devscripts hardening-wrapper
apt-get build-dep mysql-server

patch -p1 < /path/to/patch.txt

./debian/autobake-deb.sh

[/codesyntax]

Note: Here is the patch.txt

mysql hot backup with xtrabackup

I want to backup my databases without downtime... Well, there are couple approaches to this. One of them is to add a standby slave and use LVM snapshot approach. The second approach is to use percona xtrabackup. My problem is that I have some huge databases here, so taking local backups is out of the question.

But just to remain written this is how to do local hot backups.

1. Take the backup
[codesyntax lang="bash"]

time innobackupex --user=user --password=password --no-timestamp --rsync --slave-info --safe-slave-backup /path/mysql

[/codesyntax]

2. At this point the data is not ready to be restored. There might be uncommitted transactions to be undone or transactions in the logs to be replayed. Doing those pending operations will make the data files consistent and it is the purpose of the prepare stage. Once this has been done, the data is ready to be used.

[codesyntax lang="bash"]

time innobackupex --apply-log --use-memory=12G /path/mysql

[/codesyntax]

Since in my case having storing the backup locally (not even temporary) is not an option I need to transport the whole thing to the storage server.
The lame solution is to use NFS... but c'mon... grrrr... no!

The right solution is to stream the data directly to the storage server. To achieve this I am using netcat. Also to gain more performance I am compressing the data on the fly.

1. On the target machine
[codesyntax lang="bash"]

nc -l -p 9999 | qpress -dio | xbstream -x -C /path/to/mysql

[/codesyntax]

2. On the source machine
[codesyntax lang="bash"]

time innobackupex --user=user --password=pass --no-timestamp --slave-info --parallel=$((`nproc`-2)) --safe-slave-backup --stream=xbstream /data/db/mysql | qpress -io something | nc destination 9999

[/codesyntax]

3. On the target machine
Apply the transaction log in order to make the backup consistent.

[codesyntax lang="bash"]

time innobackupex --apply-log --use-memory=120G /path/to/mysql

[/codesyntax]

Note: --use-memory option is used to speed up the process by using more memory (the default is 100MB which is a quite small value).

How to recover the space used by UNDO_LOG in InnoDB tablespaces

The panic started when the monitoring server sent an alert about the storage of one of our MySQL servers - saying that the disk is about to get filled.

I realized that a a lot of the disk space was used by the InnoDB's shared tablespace ibdata1. But wait... I have innodb_file_per_table enabled. So, what is stored in ibdata1?

According to this serverfault post, ibdata1 contains vital InnoDB informations:

  • Table Data Pages
  • Table Index Pages
  • Data Dictionary
  • MVCC Control Data
    • Undo Space
    • Rollback Segments
  • Double Write Buffer (Pages Written in the Background to avoid OS caching)
  • Insert Buffer (Changes to Secondary Indexes)

Next step was to download the InnoDB Ruby Tools made by Jeremy Cole and to check what is being stored in the ibdata1.

[codesyntax lang="bash"]

ruby ~/innodb_ruby/bin/innodb_space -f ibdata1,data1,data2 space-page-type-summary
type                count       percent     description
UNDO_LOG            1969570     83.55       Undo log
INDEX               230460      9.78        B+Tree index
IBUF_FREE_LIST      115168      4.89        Insert buffer free list
INODE               22337       0.95        File segment inode
ALLOCATED           19424       0.82        Freshly allocated
IBUF_BITMAP         143         0.01        Insert buffer bitmap
XDES                142         0.01        Extent descriptor
SYS                 130         0.01        System internal
FSP_HDR             1           0.00        File space header
TRX_SYS             1           0.00        Transaction system header

[/codesyntax]

So, it has 1969570 UNDO_LOG pages which is almost 84% of the table space...

Now, according to some stackoverflow post shrinking or purging ibdata1 file is NOT possible without doing a logical dump/import.

Well... I managed to reclaim that space!

  • I dumped the structure of the databases

[codesyntax lang="bash"]

mysqldump --no-data database1 > database1.sql
mysqldump --no-data database2 > database2.sql

[/codesyntax]

  • I stopped the mysql server

[codesyntax lang="bash"]

/etc/init.d/mysql stop

[/codesyntax]

  • I renamed mysql dir (/data/db/mysql in my case)

[codesyntax lang="bash"]

mv /data/db/mysql /data/db/mysql.bak
mkdir -vp /data/db/mysql
cp -av /data/db/mysql.bak/mysql /data/db/mysql/
cp -av /data/db/mysql.bak/{aria_log.00000001,aria_log_control} /data/db/mysql/

[/codesyntax]

  • I did some clean up

[codesyntax lang="bash"]

rm -vf /data/db/mysql/mysql/innodb_{index,table}_stats.*
rm -vf /data/db/mysql/mysql/gtid_slave_pos.*

[/codesyntax]

  • Last few things before starting mysql

[codesyntax lang="bash"]

mkdir -vp /data/db/mysql/tmp
chown mysql:mysql /data/ -R

[/codesyntax]

  • I started mysql and checked tables for MySQL upgrade (this is really necessary in order to recreate innodb_{index,table}_stats tables).

[codesyntax lang="bash"]

/etc/init.d/mysql start
mysql_upgrade -uuser -ppassword --force
/etc/init.d/mysql restart

[/codesyntax]

  • No errors in mysql log, so I imported the dumps I created before

[codesyntax lang="bash"]

mysql database1 < database1.sql
mysql database2 < database2.sql

[/codesyntax]

  • I discarded the tables space for every single table

[codesyntax lang="bash"]

alter table database1.table1 discard tablespace;
alter table database1.table2 discard tablespace;
alter table database1.tableX discard tablespace;

alter table database2.table1 discard tablespace;
alter table database2.table2 discard tablespace;
alter table database2.tableX discard tablespace;

[/codesyntax]

  • I moved the data files (the .ibd files) and imported the table space for every single table

[codesyntax lang="bash"]

alter table database1.table1 import tablespace;
alter table database1.table2 import tablespace;
alter table database1.tableX import tablespace;

alter table database2.table1 import tablespace;
alter table database2.table2 import tablespace;
alter table database2.tableX import tablespace;

[/codesyntax]

Good luck!