ORGANIZING ARTIFICIAL INTELLIGENCE STRATEGIES INTO SYSTEMS (OASIS)


Files

Hypolite_upenngdas_0175C_15668.pdf (10.05 MB)

Degree type

Doctor of Philosophy (PhD)

Graduate group

Computer and Information Science

Discipline

Computer Sciences
Data Science
Operations Research, Systems Engineering and Industrial Engineering

Subject

artificial intelligence
automated reasoning
dual process theory
machine learning
Pareto optimization
phishing

Copyright date

2023

Abstract

Despite their impressive accuracy, high-quality oracles can be too slow to use in many real-time and human-response scenarios, while statistical machine learning (SML) classifiers are responsive but insufficiently accurate and brittle in the face of fast-evolving inputs. This dissertation presents a novel approach to obtaining both accuracy and speed from an artificial intelligence (AI) system. We propose a general architecture, based on dual process theory, for composing multiple AIs with contrasting accuracy and speed objectives. The result is an AI that offers desirable trade-offs between these objectives, unattainable by any of its components, and that can adapt over time to accommodate model degradation. We demonstrate how to engineer such a dual-process system, as well as a multiprocess generalization of the dual model to more than two reasoners. We apply these instantiations to the problem of detecting deceptive phishing websites. The results of a four-month measurement campaign validate the architecture's "hybrid vigor"-like advantage. The dual model reduces error rate by 70% (from 8.3% to 2.5%) while bringing time-per-input down from roughly 6.5 s to 1 s. Generalizing to the multiprocess model expands tunability, bringing latency below 800 ms with half the error of the base SML classifier. With time budgeting, we can further reduce both average time by 83% (to below 150 ms) and 98th-percentile time by 50% (to below 3.5 s) while keeping error rate under 10%. We further show that it is possible to hide potential disruptions due to long adaptation times with negligible impact on accuracy.
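The composition of a fast and a slow reasoner described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the system decides which reasoner answers, so the confidence-threshold rule below is an assumption made purely for illustration, and every name in it (`DualProcess`, `toy_fast`, `toy_slow`, `threshold`) is hypothetical rather than taken from the dissertation.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Hypothetical interfaces: the fast reasoner returns (label, confidence in [0, 1]);
# the slow reasoner returns only a label. True means "phishing".
FastReasoner = Callable[[str], Tuple[bool, float]]
SlowReasoner = Callable[[str], bool]


@dataclass
class DualProcess:
    fast: FastReasoner
    slow: SlowReasoner
    threshold: float = 0.9  # assumed tuning knob, not a value from the dissertation

    def classify(self, item: str) -> Tuple[bool, str]:
        """Return (label, name of the reasoner that produced it)."""
        label, confidence = self.fast(item)
        if confidence >= self.threshold:
            return label, "fast"
        # Low confidence: pay the latency cost of the slow reasoner.
        return self.slow(item), "slow"


def toy_fast(url: str) -> Tuple[bool, float]:
    # Toy stand-in for an SML classifier: confident only on obvious cases.
    if "login-verify" in url:
        return True, 0.95
    if url.startswith("https://example.org"):
        return False, 0.97
    return False, 0.55


def toy_slow(url: str) -> bool:
    # Toy stand-in for a slow, more accurate oracle.
    return "paypa1" in url or "login-verify" in url


system = DualProcess(fast=toy_fast, slow=toy_slow, threshold=0.9)
results = {
    url: system.classify(url)
    for url in [
        "https://example.org/docs",
        "http://login-verify.example.net",
        "http://paypa1.example.com",
    ]
}
```

In this sketch, raising `threshold` sends more inputs to the slow reasoner, trading latency for accuracy. That is one simple way to picture the accuracy and speed trade-off the abstract refers to.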

Date of degree

2023
