Packet filtering performance of basic firewalls largely affects the throughput of a network protected by the firewall. The packet filtering firewalls filter packets based on a set of filtering rules. The traditional approach for packet filtering works by checking a packet against the filtering rules by scanning from the first rule in the set and continuing to scan rules until a match is found. If no match is found, then a default rule is applied. This approach is inefficient if the number of rules is too large and majority of the packets match with rules located towards the end of the rule set. In this paper, we propose a data mining based technique for packet filtering. We consider each rule in the rule set a class. A classifier is first trained with labeled training data. Each such labeled data point contains a packet header info and the corresponding class label (i.e., rule number with which the packet matches). Then the classifier is used to classify new incoming packets. The predicted class (i.e., rule number) is checked against the packet to see if this packet really matches the predicted rule. If yes, the corresponding action (i.e., accept or deny) of the rule is taken. Otherwise (if prediction of the classifier is wrong), we go back to the traditional way of matching rules. The advantage of this data mining firewall is that it offers a much faster rule matching. We have proven both analytically and empirically that even with millions of real network traffic packets and hundreds of rules, the classifier can achieve very high accuracy, thereby making firewall six times or more faster in making filtering decision.