How 'Hey Siri' Works
- Apr 16
- Posted by Michael
- Posted in Uncategorized
How does Apple's personal digital assistant know when you've said 'Hey Siri'? And how does it know you're the one who said it? The answer, of course, is artificial intelligence.
Last fall, Apple's Machine Learning Journal began a deep dive into 'Hey Siri', the voice trigger for the company's personal digital assistant. (See below.) This spring, the Journal is back with another dive, this time into how the system recognizes not only what is said but who said it, and how it balances imposter accepts against false rejections.
From Apple:
The phrase "Hey Siri" was originally chosen to be as natural as possible; in fact, it was so natural that even before this feature was introduced, users would invoke Siri using the home button and inadvertently prepend their requests with the words, "Hey Siri." Its brevity and ease of articulation, however, bring to bear additional challenges. In particular, our early offline experiments showed, for a reasonable rate of correctly accepted invocations, an unacceptable number of unintended activations. Unintended activations occur in three scenarios - 1) when the primary user says a similar phrase, 2) when other users say "Hey Siri," and 3) when other users say a similar phrase. The last one is the most annoying false activation of all. In an effort to reduce such False Accepts (FA), our work aims to personalize each device such that it (for the most part) only wakes up when the primary user says "Hey Siri." To do so, we leverage techniques from the field of speaker recognition.
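The trade-off between false accepts and false rejections comes down to where a score threshold is set. The following is a minimal Python sketch of that trade-off, not anything from Apple's article: the function name `error_rates`, the scores, and the thresholds are all made-up illustrations.

```python
def error_rates(target_scores, imposter_scores, threshold):
    """Return (false_reject_rate, false_accept_rate) at a given threshold."""
    # A genuine "Hey Siri" from the primary user that scores too low is rejected.
    false_rejects = sum(1 for s in target_scores if s < threshold)
    # Anything else that scores high enough is accepted by mistake.
    false_accepts = sum(1 for s in imposter_scores if s >= threshold)
    return false_rejects / len(target_scores), false_accepts / len(imposter_scores)

# Illustrative scores only (hypothetical, not measured data).
target = [0.9, 0.8, 0.75, 0.6, 0.95]
imposter = [0.2, 0.55, 0.7, 0.1, 0.4]

for threshold in (0.5, 0.65, 0.8):
    fr, fa = error_rates(target, imposter, threshold)
    print(threshold, fr, fa)
# prints:
# 0.5 0.0 0.4
# 0.65 0.2 0.2
# 0.8 0.4 0.0
```

Raising the threshold removes false accepts but starts rejecting the real user, which is why lowering one error rate without raising the other needs extra information such as who is speaking.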
It also covers explicit vs. implicit training: Namely, the process at setup and the ongoing process during daily use.
The main design discussion for personalized "Hey Siri" (PHS) revolves around two methods for user enrollment: explicit and implicit. During explicit enrollment, a user is asked to say the target trigger phrase a few times, and the on-device speaker recognition system trains a PHS speaker profile from these utterances. This ensures that every user has a faithfully-trained PHS profile before he or she begins using the "Hey Siri" feature, thus immediately reducing imposter accept (IA) rates. However, the recordings typically obtained during the explicit enrollment often contain very little environmental variability. This initial profile is usually created using clean speech, but real-world situations are almost never so ideal.
This brings to bear the notion of implicit enrollment, in which a speaker profile is created over a period of time using the utterances spoken by the primary user. Because these recordings are made in real-world situations, they have the potential to improve the robustness of our speaker profile. The danger, however, lies in the handling of imposter accepts and false alarms; if enough of these get included early on, the resulting profile will be corrupted and not faithfully represent the primary user's voice. The device might begin to falsely reject the primary user's voice or falsely accept other imposters' voices (or both!) and the feature will become useless.
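To make the two enrollment modes concrete, here is a rough Python sketch. It assumes each utterance has already been reduced to a fixed-length speaker vector and that a new utterance is scored by its average cosine similarity to the stored vectors. The vectors, the 0.8 threshold, the size cap, and every function name are illustrative assumptions rather than Apple's implementation.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def score(profile, vector):
    """Average similarity between a new utterance and the stored profile vectors."""
    return sum(cosine(p, vector) for p in profile) / len(profile)

def explicit_enroll(vectors):
    """Explicit enrollment: the profile starts as the setup utterances."""
    return list(vectors)

def implicit_update(profile, vector, threshold=0.8, max_size=40):
    """Implicit enrollment: keep a new utterance only if it already matches well.

    threshold and max_size are arbitrary values chosen for this sketch.
    """
    if len(profile) < max_size and score(profile, vector) >= threshold:
        profile.append(vector)
        return True
    return False

# Hypothetical 3-dimensional speaker vectors.
profile = explicit_enroll([[1.0, 0.0, 0.1], [0.9, 0.1, 0.0]])
same_speaker = [0.95, 0.05, 0.1]
imposter = [0.0, 1.0, 0.2]

print(implicit_update(profile, same_speaker))  # True: added to the profile
print(implicit_update(profile, imposter))      # False: left out
print(len(profile))                            # 3
```

The gate in `implicit_update` is the point of the sketch: if the threshold is too loose early on, imposter utterances enter the profile and every later score is computed against corrupted data, which is the failure mode the quoted passage warns about.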
In the previous Apple Machine Learning Journal entry, the team covered how the 'Hey Siri' detection process itself works.
A very small speech recognizer runs all the time and listens for just those two words. When it detects "Hey Siri", the rest of Siri parses the following speech as a command or query. The "Hey Siri" detector uses a Deep Neural Network (DNN) to convert the acoustic pattern of your voice at each instant into a probability distribution over speech sounds. It then uses a temporal integration process to compute a confidence score that the phrase you uttered was "Hey Siri". If the score is high enough, Siri wakes up.
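One way to picture the "temporal integration" step is a left-to-right alignment of the phrase against the per-frame probabilities. The Python sketch below is a simplified stand-in for whatever Apple actually uses: the three sound class labels, the per-frame averaging, and the -0.5 threshold are assumptions made purely for illustration.

```python
import math

def phrase_score(frames, phrase):
    """Best left-to-right alignment score of `phrase` against `frames`.

    frames: list of dicts mapping sound class -> probability, one per frame.
    phrase: ordered list of sound classes the trigger phrase passes through.
    Returns the average log probability per frame along the best path.
    """
    neg_inf = float("-inf")
    if not frames:
        return neg_inf
    best = [neg_inf] * len(phrase)
    for t, frame in enumerate(frames):
        new = [neg_inf] * len(phrase)
        for i, sound in enumerate(phrase):
            stay = best[i]                                   # remain in the same sound
            advance = best[i - 1] if i > 0 else neg_inf      # move on to the next sound
            start = 0.0 if (t == 0 and i == 0) else neg_inf  # path must begin at the first sound
            prev = max(stay, advance, start)
            if prev > neg_inf:
                new[i] = prev + math.log(frame.get(sound, 1e-6))
        best = new
    return best[-1] / len(frames)  # path must end in the last sound

PHRASE = ["hey", "si", "ri"]  # hypothetical sound classes
THRESHOLD = -0.5              # hypothetical wake threshold

good = [
    {"hey": 0.9, "si": 0.05, "ri": 0.05},
    {"hey": 0.8, "si": 0.15, "ri": 0.05},
    {"hey": 0.05, "si": 0.9, "ri": 0.05},
    {"hey": 0.05, "si": 0.05, "ri": 0.9},
]
noise = [{"other": 0.9, "hey": 0.05, "si": 0.03, "ri": 0.02}] * 4

print(phrase_score(good, PHRASE) > THRESHOLD)   # True: wake up
print(phrase_score(noise, PHRASE) > THRESHOLD)  # False: stay asleep
```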
As is typical for Apple, it's a process that involves both hardware and software.
The microphone in an iPhone or Apple Watch turns your voice into a stream of instantaneous waveform samples, at a rate of 16000 per second. A spectrum analysis stage converts the waveform sample stream to a sequence of frames, each describing the sound spectrum of approximately 0.01 sec. About twenty of these frames at a time (0.2 sec of audio) are fed to the acoustic model, a Deep Neural Network (DNN) which converts each of these acoustic patterns into a probability distribution over a set of speech sound classes: those used in the "Hey Siri" phrase, plus silence and other speech, for a total of about 20 sound classes.
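The numbers in that passage are easy to check. The Python sketch below works through the arithmetic and shows one possible way to cut a sample stream into frames and model inputs. Non-overlapping frames and a window that advances one frame at a time are simplifying assumptions, and the function names are hypothetical.

```python
SAMPLE_RATE = 16000      # samples per second (from the quoted passage)
FRAME_SECONDS = 0.01     # approximate audio described by one frame
FRAMES_PER_WINDOW = 20   # "about twenty" frames fed to the acoustic model

samples_per_frame = round(SAMPLE_RATE * FRAME_SECONDS)
samples_per_window = samples_per_frame * FRAMES_PER_WINDOW

def split_frames(samples):
    """Cut a sample stream into consecutive, non-overlapping frames."""
    count = len(samples) // samples_per_frame
    return [samples[i * samples_per_frame:(i + 1) * samples_per_frame]
            for i in range(count)]

def sliding_windows(frames):
    """Yield each run of FRAMES_PER_WINDOW consecutive frames, one frame apart."""
    for start in range(len(frames) - FRAMES_PER_WINDOW + 1):
        yield frames[start:start + FRAMES_PER_WINDOW]

one_second = [0.0] * SAMPLE_RATE  # one second of silence as placeholder audio
frames = split_frames(one_second)
windows = list(sliding_windows(frames))
print(samples_per_frame, samples_per_window, len(frames), len(windows))
# prints: 160 3200 100 81
```

So each frame covers 160 samples, each model input covers 3,200 samples (0.2 seconds), and under these assumptions one second of audio yields 81 inputs.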
And yes, that goes right down to the silicon, thanks to the Always On Processor, the embedded motion coprocessor that is now part of the A-series system-on-a-chip.
To avoid running the main processor all day just to listen for the trigger phrase, the iPhone's Always On Processor (AOP) (a small, low-power auxiliary processor, that is, the embedded Motion Coprocessor) has access to the microphone signal (on 6S and later). We use a small proportion of the AOP's limited processing power to run a detector with a small version of the acoustic model (DNN). When the score exceeds a threshold the motion coprocessor wakes up the main processor, which analyzes the signal using a larger DNN. In the first versions with AOP support, the first detector used a DNN with 5 layers of 32 hidden units and the second detector had 5 layers of 192 hidden units.
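A quick Python sketch shows why the split pays off. It counts only the weights between consecutive hidden layers (the input and output layer sizes are not given in the passage, so they are left out), then models the two-stage check as a cascade. The thresholds and function names are assumptions for illustration.

```python
def hidden_weights(layers, units):
    """Weights between consecutive hidden layers only (input/output layers excluded)."""
    return (layers - 1) * units * units

small = hidden_weights(5, 32)    # first-pass detector on the AOP
large = hidden_weights(5, 192)   # second-pass detector on the main processor
print(small, large, large // small)
# prints: 4096 147456 36

def cascade(small_score, large_score, first_threshold=0.5, second_threshold=0.7):
    """Run the cheap detector first; consult the large one only if it fires.

    large_score is a zero-argument callable, so the expensive model is never
    evaluated (and the main processor never woken) unless stage one passes.
    """
    if small_score < first_threshold:
        return False
    return large_score() >= second_threshold

calls = []

def expensive_model():
    """Stand-in for the larger DNN; records each time it is invoked."""
    calls.append(1)
    return 0.9

print(cascade(0.2, expensive_model), len(calls))  # False 0
print(cascade(0.6, expensive_model), len(calls))  # True 1
```

By this rough count the larger network has 36 times as many hidden-layer weights, so keeping it asleep until the small detector fires avoids most of the work.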
The series is fascinating and I very much hope the team continues to detail it. We're entering an age of ambient computing where we have multiple voice-activated AI assistants not just in our pockets but on our wrists, on our laps and desks, in our living rooms and in our homes.
Voice recognition, voice differentiation, multi-personal assistants, multi-device mesh assistants, and all sorts of new paradigms are growing up and around us to support the technology. All while trying to make sure it stays accessible... and human.
We live in utterly amazing times.