TIPICAL - Type Inference for Python In Critical Accuracy Level

Software Engineering Research, Management and Applications

Authors: Jonathan Elkobi and Bernd Gruner and Clemens-Alexander Brust

Abstract: Type inference methods based on deep learning are becoming increasingly popular as they aim to compensate for the drawbacks of static and dynamic analysis approaches, such as high uncertainty. However, their practical application is still debatable due to several intrinsic issues; for example, code from different software domains involves data types that are unknown to the type inference system. To overcome these problems and obtain high-confidence predictions, we present TIPICAL, a method that combines deep similarity learning with novelty detection. We show that our method can better predict data types with high confidence by successfully filtering out unknown and inaccurately predicted data types, and that it achieves higher F1 scores than the state-of-the-art type inference method Type4Py. Additionally, we investigate how different software domains and data type frequencies may affect the results of our method.
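
The general idea of pairing similarity-based prediction with novelty detection can be illustrated with a minimal sketch. This is not the TIPICAL implementation: the nearest-neighbor lookup, the Euclidean distance, the fixed `max_distance` threshold, and the names `predict_type` and `KNOWN` are all assumptions chosen for illustration. The sketch assigns a query embedding the type of its nearest labeled neighbor and abstains (returns `None`) when that neighbor is too far away, which stands in for filtering out unknown types.

```python
import math
from typing import Optional, Sequence, Tuple

Embedding = Sequence[float]


def euclidean(a: Embedding, b: Embedding) -> float:
    """Euclidean distance between two equally sized vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def predict_type(
    query: Embedding,
    labeled: Sequence[Tuple[Embedding, str]],
    max_distance: float,
) -> Optional[str]:
    """Return the label of the nearest labeled embedding.

    Hypothetical novelty check: if even the nearest neighbor is farther
    away than max_distance, the query is treated as an unknown type and
    no prediction is made.
    """
    best_label: Optional[str] = None
    best_dist = math.inf
    for vector, label in labeled:
        dist = euclidean(query, vector)
        if dist < best_dist:
            best_label, best_dist = label, dist
    if best_dist > max_distance:
        return None  # abstain instead of guessing
    return best_label


# Toy 2-D "embeddings"; a learned model would produce these in practice.
KNOWN = [
    ((0.0, 0.0), "int"),
    ((0.1, 0.0), "int"),
    ((1.0, 1.0), "str"),
]

near_int = predict_type((0.05, 0.0), KNOWN, max_distance=0.5)
far_away = predict_type((5.0, 5.0), KNOWN, max_distance=0.5)
```

In this toy setup `near_int` receives a label because it lies close to known `int` examples, while `far_away` is rejected. Lowering `max_distance` makes the predictor abstain more often, which keeps fewer but more reliable predictions.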