This paper proposes a new data compression algorithm based on Boolean minimization of binary data. On the compressor side, the input bit-stream is split into 16-bit chunks, and a minimized "sum of products" function is found for each chunk using the Quine-McCluskey algorithm. The minimized functions are stored in a file, and Huffman coding is then applied to this file. The resulting Huffman code is used to convert the original file into a compressed one. On the decompression side, the Huffman tree is used to retrieve the original file. Experimental results show an average saving ratio of around 50%. In addition, the worst case was investigated and a remedy for it is suggested. The proposed technique can be used for various file formats, including images and videos.
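The chunking and minimization step can be illustrated with a minimal Python sketch. It rests on assumptions not stated in the abstract: each 16-bit chunk is read as the truth table of a four-variable Boolean function (the leftmost bit is minterm 0), and the variable names A, B, C, D are hypothetical. The sketch performs only the first phase of Quine-McCluskey (generating prime implicants); selecting a minimal cover and the Huffman coding stage are omitted.

```python
from itertools import combinations

def chunk_to_minterms(chunk: int) -> list[int]:
    """Return positions (0..15, leftmost bit first) where the 16-bit chunk is 1."""
    bits = format(chunk & 0xFFFF, "016b")
    return [i for i, b in enumerate(bits) if b == "1"]

def combine(a: str, b: str):
    """Merge two implicants that differ in exactly one 0/1 position, else None."""
    diff = [i for i in range(len(a)) if a[i] != b[i]]
    if len(diff) == 1 and "-" not in (a[diff[0]], b[diff[0]]):
        i = diff[0]
        return a[:i] + "-" + a[i + 1:]
    return None

def prime_implicants(minterms, nvars: int = 4) -> set[str]:
    """First phase of Quine-McCluskey: merge implicants until none can be merged."""
    current = {format(m, f"0{nvars}b") for m in minterms}
    primes = set()
    while current:
        used, merged_next = set(), set()
        for a, b in combinations(sorted(current), 2):
            merged = combine(a, b)
            if merged is not None:
                merged_next.add(merged)
                used.update((a, b))
        primes |= current - used  # implicants that could not be merged are prime
        current = merged_next
    return primes

def to_product(term: str, names: str = "ABCD") -> str:
    """Render an implicant such as '00--' as a product term such as A'B'."""
    literals = [n + ("" if b == "1" else "'") for n, b in zip(names, term) if b != "-"]
    return "".join(literals) or "1"

# Assumed example chunk: the first four truth-table entries are 1, the rest 0.
chunk = 0xF000
terms = sorted(prime_implicants(chunk_to_minterms(chunk)))
print(" + ".join(to_product(t) for t in terms))  # prints A'B'
```

In this example the sixteen input bits reduce to a single two-literal product term, which is the kind of redundancy a minimization-based scheme would aim to exploit before entropy coding.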
Text categorization is based on the idea of content-based text clustering. An Artificial Neural Network (ANN), or simply Neural Network (NN), classifier for Arabic text categorization is proposed. Singular Value Decomposition (SVD) is used as a preprocessor to further reduce the data in both size and dimensionality. Indeed, the use of SVD makes the data more amenable to classification and speeds up the convergence of training. Specifically, Multilayer Perceptron (MLP) and Radial Basis Function (RBF) classifiers are implemented and their effectiveness is evaluated. Experiments are conducted using an in-house corpus of Arabic texts. Precision, recall and F-measure are used to quantify categorization effectiveness. The results show that the proposed SVD-supported MLP/RBF ANN classifier achieves high effectiveness. Experimental results also show that the MLP classifier outperforms the RBF classifier and that the SVD-supported NN classifier is better than the basic NN, as far as Arabic text categorization is concerned.
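The evaluation measures named above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' evaluation code: the F-measure is taken to be the balanced F1 score (the abstract does not specify a weighting), and the category labels and predictions are hypothetical data invented for the example.

```python
def precision_recall_f1(true_labels, predicted_labels, category):
    """Per-category precision, recall and balanced F-measure (F1)."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if p == category and t == category)
    fp = sum(1 for t, p in pairs if p == category and t != category)
    fn = sum(1 for t, p in pairs if p != category and t == category)
    # Guard against division by zero when a category is never predicted or never occurs.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Hypothetical gold labels and classifier outputs for five documents.
true_labels = ["sports", "sports", "politics", "economy", "politics"]
predicted = ["sports", "politics", "politics", "economy", "sports"]

for category in ("sports", "politics", "economy"):
    p, r, f = precision_recall_f1(true_labels, predicted, category)
    print(f"{category}: precision={p:.2f} recall={r:.2f} F1={f:.2f}")
```

Computing the measures per category and then averaging them is one common way to summarize a multi-class categorizer; whether the paper uses macro- or micro-averaging is not stated in the abstract.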