GPU Accelerated Theil-Sen Estimator for Censored Data

Matthew Thomas

Evanston, Illinois


A GPU-accelerated version of the estimator described in Akritas, Murphy, and LaValley (1995). The estimator works when both the explanatory and response variables are censored, even when the censoring process is unknown. It is also highly resistant to outliers. The goal of this project is to improve on it.

Project status: Under Development

Artificial Intelligence

Groups
Student Developers for AI

Intel Technologies
Intel Python, MKL

Code Samples [1]

Overview / Usage

A well-cited paper suggests, based on simulations, that a version of the Theil-Sen estimator is well suited to censored data. The estimator is frequently used to interpret data that comes from sensors and telescopes. Because interpreting censored data from sensors is important in AI and machine learning, this estimator is useful, and any improvement to it is valuable.

In the paper, (Xt, Yt) are iid random variables that would be observed in the absence of censoring. As usual:

Yt = a + b*Xt + u

Instead of observing the true data, we observe the quadruple (X, Y, Dx, Dy), where X = min(Xt, Xc), Y = min(Yt, Yc), Xc and Yc are the censoring values, and Dx and Dy indicate whether X = Xt and Y = Yt respectively (that is, whether that coordinate of the observation is uncensored). We are interested in estimating b and do not care about the accuracy of a.
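To make the observation scheme concrete, here is a minimal sketch that simulates data of this form with NumPy. The function name simulate_censored, the normal distributions, and the parameter values are all assumptions chosen for illustration; they do not come from the paper or from the project's code.

```python
import numpy as np

def simulate_censored(n, a=1.0, b=2.0, seed=0):
    """Hypothetical helper: draw (X, Y, Dx, Dy) under the model Yt = a + b*Xt + u."""
    rng = np.random.default_rng(seed)
    xt = rng.normal(size=n)                  # true explanatory variable Xt
    u = rng.normal(size=n)                   # error term
    yt = a + b * xt + u                      # true response Yt
    # Censoring values Xc, Yc: their distributions are an arbitrary assumption.
    xc = rng.normal(loc=1.0, size=n)
    yc = rng.normal(loc=a + 2.0, scale=3.0, size=n)
    x = np.minimum(xt, xc)                   # X = min(Xt, Xc)
    y = np.minimum(yt, yc)                   # Y = min(Yt, Yc)
    dx = xt <= xc                            # True where X = Xt (uncensored)
    dy = yt <= yc                            # True where Y = Yt (uncensored)
    return x, y, dx, dy

x, y, dx, dy = simulate_censored(1000)
print("share of uncensored X:", dx.mean())
print("share of uncensored Y:", dy.mean())
```

Only x, y, dx, and dy would be available to an estimator; the true values and the censoring values are discarded on purpose.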

I am interested in finding an estimator that improves upon this one. Quantile regression methods appear to be more efficient but are biased in this context.

Methodology / Approach

The function is written in Python using the CuPy library to achieve GPU acceleration.

Technologies Used

The function is written entirely in CuPy, so the computation runs on the GPU. Because the estimator mostly uses matrix operations, the implementation is very fast. The portions that do not benefit from GPU acceleration can still be accelerated by Intel MKL when the code is run under the Intel Distribution for Python or Anaconda.
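As an illustration of why this kind of estimator maps well onto matrix operations, here is a minimal sketch of the classical Theil-Sen slope (the median of all pairwise slopes) for uncensored data. This is not the censored estimator from the paper and not the code in the repository; the function name theil_sen_slope is hypothetical. It uses NumPy so that it runs without a GPU, on the assumption that the same array calls could be swapped for their CuPy equivalents.

```python
import numpy as np

def theil_sen_slope(x, y):
    """Classical (uncensored) Theil-Sen slope: median of all pairwise slopes."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Build all pairwise differences at once with broadcasting (n x n matrices).
    diff_x = x[:, None] - x[None, :]
    diff_y = y[:, None] - y[None, :]
    # Keep each unordered pair once by taking the strict upper triangle.
    i, j = np.triu_indices(len(x), k=1)
    diff_x, diff_y = diff_x[i, j], diff_y[i, j]
    # Skip pairs with identical x values, whose slope is undefined.
    keep = diff_x != 0
    return np.median(diff_y[keep] / diff_x[keep])

x = np.arange(10)
y = 3.0 + 2.0 * x
y[0] = 100.0                    # a single gross outlier
print(theil_sen_slope(x, y))    # prints 2.0
```

The pairwise difference matrices need memory proportional to n squared, so a sketch like this suits moderate sample sizes; larger inputs would need the pairs processed in chunks.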

Repository

https://gist.github.com/MattWThomas/25f69988105de9a801f6b7fae5e6375d
