Bash Shell: Working with large text files

When working with multi-Gb text files I use these commands:

1. Get the first line which often contains column names and dump it into a small text file



uki $ head -n 1 source_file_name.txt > header_line.txt



2. Get first record after the headline and dump it into a small text file

uki $ head -n 2 source_file_name.txt | tail -1 > first_data_line.txt 

3. Finally, when developing using large files, I take SAMPLE 1000 records (out of millions) to speed up the dev time, I use 1000 because that is default SELECT * number of records in MySQL, but you can use any other if you want, but I would not go too small as you many not catch memory leak errors. The random number 2500 in this example I would change occasionally to pull different sample. You do want to sample your data in different places.


uki $ head -n 2500 source_file_name.txt | tail -1000 > sample_1000_records.txt 

Resulting files:

Eclipse: increase size of Console buffer size

Then you are executing applications that have a lot System.out.print.. output, you might want to increate size of your Console to hold more text.

Eclipse > Preferences > type in Console in search > Run/Debug > Console

Increase buffer size from default 80,000 characters to MAX 999999.


Java: importing data from a big text file to MySQL

These 2 classes show you how to read a HUGE TEXT file and insert data from each line to the MySQL database efficiently. The parsing of the lines is out of scope of this exercise as it will be different for each application.


package com.your_package.data;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Date;

public class YourClassName
{

public static void main(String[] args)
{
String db = "your_db_name";
String user = "root";
String password = "your_password";
Connection connection = makeDbConnection(db, user, password);

BigTextFile file = null;
try
{
file = new BigTextFile("/Users/uki/Documents/file_name.txt");
} catch (Exception e)
{
e.printStackTrace();
}

for (String line : file)
{
// process line here the way you want it
System.out.println(line.substring(line.lastIndexOf("|") + 1));

String partNumber = "0001";
String attributeName = "some name";
String attributeValue = "some value";

String sql = buildInsertSqlStatement(attributeName, attributeValue, partNumber);
insertIntoDb(connection, sql);

break; // remove after all works
}

}

private static String buildInsertSqlStatement(String attributeName, String attributeValue, String partNumber)
{
StringBuffer sql = new StringBuffer();
sql.append("INSERT INTO your_database_name.table_name VALUES (");
sql.append("'" + attributeName + "'");
sql.append(", '" + attributeValue + "'");
sql.append(", '" + partNumber + "' );");
return sql.toString();
}

private static void insertIntoDb(Connection connection, String sql)
{
final String TAG = YourClassName.class.getCanonicalName();
try
{
Statement st = connection.createStatement();

System.out.println(TAG + "Executing: " + sql);
int val = st.executeUpdate(sql.toString());
System.out.println(TAG + " Returned: " + val);

} catch (SQLException e)
{
System.out.println("SQL insert failed " + e);
}

System.out.println(TAG + "Finished " + new Date());
}

private static Connection makeDbConnection(String db, String user, String password)
{
Connection con = null;
try
{
String url = "jdbc:mysql://localhost:3306/";
String driver = "com.mysql.jdbc.Driver";
Class.forName(driver);
con = DriverManager.getConnection(url + db, "root", "");
} catch (Exception e)
{
e.printStackTrace();
}
return con;
}
}





package com.your_package.data;

import java.io.BufferedReader;
import java.io.FileReader;
import java.util.Iterator;

public class BigTextFile implements Iterable
{
private class FileIterator implements Iterator
{
private String line;

public boolean hasNext()
{
try
{
line = bufferedReader.readLine();
} catch (Exception ex)
{
line = null;
ex.printStackTrace();
}

return line != null;
}

public String next()
{
return line;
}

public void remove()
{
}
}

private BufferedReader bufferedReader;

public BigTextFile(String filePath) throws Exception
{
bufferedReader = new BufferedReader(new FileReader(filePath));
}

public void Close()
{
try
{
bufferedReader.close();
} catch (Exception ex)
{
}
}

public Iterator iterator()
{
return new FileIterator();
}
}

MySQL: importing table data from text file


You can import table data from text file.

If columns are matched in order to delimited file, then it is very easy, example data:

column1data|column2data|column3data| |column5data
column1data|column2data|column3data|column4data|column5data


Note that the file name should be the exact name of the table you are writing to:


uki: ~ $ mysqlimport --local -u root --fields-terminated-by="|" sears_product_info /Users/uki/Documents/XYZ/table_name.txt 

database_name. table_name: Records: 526733  Deleted: 0  Skipped: 526733  Warnings: 557695


Note that if the record already exists then it will not be imported again (a good feature for me).
If you want to clean the table content then you execute 
DELETE FROM database_name. table_name
The import of 500,000 records may take up to 10 minutes on MacBook Pro.

Android: obtaining Google Map apiKey

TO GET MD5 for DEVELOPERS (DEBUG):


STEP 1. Find you debug.keystore file 




STEP 2a. Open Terminal and execute keytool command

$ keytool -list -storepass android -keystore /Users/uki/.android/debug.keystore


androiddebugkey, Jan 21, 2010, PrivateKeyEntry, 
Certificate fingerprint (MD5): A0:AC:1A:E2:E7:06:C2:93:CF:9E:xxxxxxx.....

skip to STEP 3


STEP 2b.


FOR DEPLOYMENT (not local development):
Build (Export) your application for deployment at least once to you have your deployment signature. Follow the wizard and make certificate for 35 years.


Use the same keytool utility:

uki: ~ $  keytool -list -keystore /some_directory/keystore_name.keystore
Enter keystore password:

Keystore type: JKS
Keystore provider: SUN

Your keystore contains 13 entries

[...]
taxi share - national, Sep 21, 2011, PrivateKeyEntry,
Certificate fingerprint (MD5): 15:A1:16:70:.....................................................
[...]
uki: ~ $







4. Copy and paste the apiKey to your Maps



Select longest records from the database

To find the longest record in your table you can use this query:


SELECT max(length(column_name)) from table_name 
This can help you in optimizing the size of your tables.


SELECT column_name FROM table_nameWHERE length(column_name) =
( SELECT max(length(
( SELECT max(length(column_name)) from table_name );


Result is a list of records with the longest length



Execute SQL script (MySQL dump) from command line

1) Find you mysql installation directory:

uki: ~ $ cd /usr/local/mysql/bin

2) Start mysql shell
uki: ~ $ ./mysql -u root
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 238
Server version: 5.5.9 MySQL Community Server (GPL)
...

3) Select database you want to execute script agaist
mysql> use my_database_name
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed

4) Execute SQL script
in the next line notice "slash dot space"

mysql> \. /Users/uki/Documents/script_name.sql 

Add MySQL to PATH

1.  Open Terminal go to your home directory
uki: ~ $ cd ~

2. see if you already created .bash_profile


uki: ~ $ ls .bash*
.bash_history .bash_profile

3. if .bash_profile is not there execute
uki: ~ $ touch .bash_profile

4. open file for editing
uki: ~ $ open .bash_profile 

or using your favorite editor
uki: ~ $ bbedit .bash_profile 
5. add content to the file


# User specific Terminal config
if [ -n "$PS1" ]; then PS1='\u: \w \$ '; fi
shopt -s checkwinsize
date -u
export PATH=$PATH:/usr/local/bin
####### JAVA DEV #######
export JAVA_HOME=/Library/Java/Home/
export PATH=$PATH:$JAVA_HOME/bin
export PATH=$PATH:/usr/local/mysql/bin
6.  RE-OPEN the Terminal
7. try to run mysql command

uki: ~ $ mysql -u root
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 236
Server version: 5.5.9 MySQL Community Server (GPL)
Copyright (c) 2000, 2010, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> 



Placing Android pop-up toasts on the screen


You can place your toasts anywhere on the screen you want. In this case I put it in the center of the screen with 20 pixels off the left margin and 40 below the center line.

Toast toast = Toast.makeText(TrucksScreen.this, getString(R.string.fetching_map_details), Toast.LENGTH_SHORT);
toast.setGravity(Gravity.CENTER, -20, -40);

toast.show();

Solr Search Engine

Starting the engine with Jetty (included J2EE container):

~ uki$ cd /Users/uki/solr/apache-solr-3.3.0/exampleexample uki$ java -jar start.jar


Indexing XML:

~ uki$ cd /Users/uki/solr/apache-solr-3.3.0/example/exampledocs

ushofml299009:exampledocs uki$ java -jar post.jar /Users/uki/Documents/workspace/Spin\ XSLT/SPIN_43.xml


Statistics:

http://localhost:8983/solr/admin/stats.jsp

numDocs : 17