We participated in the Google Summer of Code 2017 (we have participated previously in 2016, 2015, 2014, and 2013). The MariaDB Foundation believes we are making a better database that remains application compatible with MySQL. We also work on making LGPL connectors (currently C, ODBC, Java) and on MariaDB Galera Cluster, which allows you to scale your reads & writes. Lately, we also have MariaDB ColumnStore, which is a columnar storage engine, designed to process petabytes of data with real-time response to analytical queries.
Please join us at irc.freenode.net
at #maria to mingle with the community. Don't forget to subscribe to maria-developers@lists.launchpad.net (this is the main list where we discuss development).
A few handy tips for any interested students who are unsure which projects to choose:Blog post from former GSoC student & mentor
The complete list of tasks suggested for GSoC 2017 is located in the MariaDB Jira. A subset is listed below.
The mysqlbinlog tool needs to be updated to understand the replication feature called Global Transaction IDs (GTIDs) in MariaDB 10. The current version does not support GTIDs and the MySQL variant does not speak MariaDB 10's GTIDs.
The purpose of this task is to create an easy-to-use facility for setting up a new MariaDB replication slave.
GIS enhancements for 10.1 that we want to work on include adding support for altitude (the third coordinate), converters (eg. ST_GeomFromGeoJSON - ST_AsGeoJSON, ST_GeomFromKML - ST_AsKML, etc.), Getting data from SHP format (shp2sql convertor), as well as making sure we are fully OpenGIS compliant.
mysqltest
is a client utility that runs tests in the mysql-test framework. It sends sql statements to the server, compares the results with the expected results, and uses a special small DSL for loops, assignments, and so on. It's pretty old and very ad hoc with many strange limitations. It badly needs a proper parser and a consistent logical grammar.
Currently one can specify only one authentication method per user. It would make a lot of sense to support multiple authentication methods per user. PAM-style. For example, one may want to authenticate using unix_socket when connecting locally, but ask for a password if connecting remotely or if unix_socket authentication failed.
Encrypting the client-server communications is closely related to authentication. Normally SSL is used for the on-the-wire encryption, and SSL can be used to authenticate the client too. GSSAPI can be used for authentication, and it has support for on-the-wire encryption. This task is about making on-the-wire encryption pluggable.
This would involve randomizing a bunch of queries (RQG based?), configurations and replication setups to search for segfaults, race conditions and perhaps invalid results.
This would involve organising a bunch of memory and threads to run on the same NUMA node. Attention to detail to ensure no additional race conditions get added in the process. A good understanding of systems programming would be useful. Ability to implement WIndows NUMA support at the same time as Linux NUMA support would be advantageous.
Skills:
C, locking and threads, Windows system programming and Innodb internals would be a plus.
Mentor:
Daniel Black
Students Interested:
1
Current Cassandra Storage Engine was developed against Cassandra 1.1 and it uses Thrift API to communicate with Cassandra. However, starting from Cassandra 1.2, the preferred way to access Cassandra database is use CQL (Cassandra Query Language) and DataStax C++ Driver (cpp-driver). Thrift-based access is deprecated and places heavy constraints on the schema.
This task is about re-implementing Cassandra Storage Engine using DataStax C++ Driver and CQL.
At the moment NULL is just the maximum integer for a column (or empty string for VARCHAR/CHAR). We need a mechanism to store NULLs separately to give us full type ranges.
Right now it is cast to double which is not great for obvious reasons. It will mean modifying a lot of ColumnStore's version of MariaDB's function implementations and allowing column files to store more than 8 bytes per field.
This includes collations and anything that works on the length of the string.
Do you have an idea of your own, not listed above or in Jira? Do let us know!
This page is licensed: CC BY-SA / Gnu FDL