The Fort Worth Press - Inner workings of AI an enigma - even to its creators

Inner workings of AI an enigma - even to its creators / Photo: © AFP

Even the greatest human minds building generative artificial intelligence, a technology poised to change the world, admit they do not comprehend how digital minds think.

"People outside the field are often surprised and alarmed to learn that we do not understand how our own AI creations work," Anthropic co-founder Dario Amodei wrote in an essay posted online in April.

"This lack of understanding is essentially unprecedented in the history of technology."

Unlike traditional software programs that follow pre-ordained paths of logic dictated by programmers, generative AI (gen AI) models are trained to find their own way to success once prompted.

In a recent podcast, Chris Olah, who was part of ChatGPT-maker OpenAI before joining Anthropic, described gen AI as "scaffolding" on which circuits grow.

Olah is considered an authority in so-called mechanistic interpretability, a method of reverse engineering AI models to figure out how they work.

This science, born about a decade ago, seeks to determine exactly how AI gets from a query to an answer.

"Grasping the entirety of a large language model is an incredibly ambitious task," said Neel Nanda, a senior research scientist at the Google DeepMind AI lab.

It was "somewhat analogous to trying to fully understand the human brain," Nanda told AFP, noting neuroscientists have yet to succeed on that front.

Delving into digital minds to understand their inner workings has gone from a little-known field just a few years ago to being a hot area of academic study.

"Students are very much attracted to it because they perceive the impact that it can have," said Boston University computer science professor Mark Crovella.

The area of study is also gaining traction due to its potential to make gen AI even more powerful, and because peering into digital brains can be intellectually exciting, the professor added.

- Keeping AI honest -

Mechanistic interpretability involves not just studying the results served up by gen AI but also scrutinizing the calculations performed while the technology mulls queries, according to Crovella.

"You could look into the model...observe the computations that are being performed and try to understand those," the professor explained.

Startup Goodfire uses AI software capable of representing data in the form of reasoning steps to better understand gen AI processing and correct errors.

The tool is also intended to prevent gen AI models from being used maliciously or from deciding on their own to deceive humans about what they are up to.

"It does feel like a race against time to get there before we implement extremely intelligent AI models into the world with no understanding of how they work," said Goodfire chief executive Eric Ho.

In his essay, Amodei said recent progress has made him optimistic that the key to fully deciphering AI will be found within two years.

"I agree that by 2027, we could have interpretability that reliably detects model biases and harmful intentions," said Auburn University associate professor Anh Nguyen.

According to Boston University's Crovella, researchers can already access representations of every digital neuron in AI brains.

"Unlike the human brain, we actually have the equivalent of every neuron instrumented inside these models," the academic said. "Everything that happens inside the model is fully known to us. It's a question of discovering the right way to interrogate that."

Harnessing the inner workings of gen AI minds could clear the way for its adoption in areas where tiny errors can have dramatic consequences, like national security, Amodei said.

For Nanda, better understanding what gen AI is doing could also catapult human discoveries, much as DeepMind's chess-playing AI, AlphaZero, revealed entirely new chess moves that none of the grandmasters had ever thought about.

A gen AI model that is properly understood and carries a stamp of reliability would gain a competitive advantage in the market.

Such a breakthrough by a US company would also be a win for the nation in its technology rivalry with China.

"Powerful AI will shape humanity's destiny," Amodei wrote.

"We deserve to understand our own creations before they radically transform our economy, our lives, and our future."

T.Dixon--TFWP