Tech vs. Touch: The Evolution of Cosmetic Grading

Apkudo's Podcast
Episode 7 | November 12, 2024 | 00:23:10

Show Notes

How does balancing human judgment with robotic precision revolutionize the device grading process in the secondary market? 

In this episode, host Allyson Mitchell, VP of Sustainability at Apkudo, and Asa Gismervik, Apkudo's Director of Quality Management, discuss the challenges associated with cosmetic grading. The reality is that humans don't agree, which means humans and robots won't agree either. Apkudo is deploying a new alignment process to overcome the inherent limitations of subjective grading techniques and achieve greater accuracy. 
 
The conversation takes a deeper dive into this fascinating world of human-machine collaboration, exploring the pivotal role of blind assessments in ensuring unbiased evaluations. Asa shares his insights into the challenges of aligning human and machine interpretations and the dangers of overfitting algorithms to specific datasets. You’ll hear how large sample runs are crucial in fine-tuning algorithms to bridge the gap between manual and automated grading techniques. Ultimately, Apkudo seeks to establish a unified approach to cosmetic grading that harmonizes the efforts of humans, machines, and supply chain partners, paving the way for a more consistent and reliable secondary market for connected devices. 
 

Resources: 

Apkudo LinkedIn 
Apkudo
Apkudo Podcast 
 


Episode Transcript

[00:00:13] Speaker A: Welcome to Apkudo's new podcast, a series about trends in the connected device industry, where I talk to players up and down the supply chain to uncover insights and industry trends and explore customer journeys, sustainability topics and much more. My name is Allyson Mitchell. I'm the VP of Sustainability at Apkudo and the host of this new podcast program. If you have questions, topics of interest or guests you want to hear, email me at allyson.mitchell@apkudo.com and we'll be sure to feature those topics on the podcast in future episodes. Our guest for today's podcast is Asa Gismervik. He's the Director of Quality Management at Apkudo. Welcome, Asa.

[00:00:51] Speaker B: Hi. Thanks for having me on the podcast. I'm very excited.

[00:00:55] Speaker A: Well, it's great to have you, and I'm excited for you to share with our audience today on the topic of cosmetic grading for connected devices, which happens after they've been recovered from their first life and gets them ready for their second life, or their journey into the secondary market. So we'll get into the roles of humans and robotics in this process, the importance of closing the accuracy gap to achieve alignment on the grade of the devices, and the impact that our approach is making for our customers. But before we dive into today's topic, Asa, I'd love to hear a little bit more from you about your role at Apkudo. You've been with the company for several years, and over that time your role has evolved quite a bit, right?

[00:01:39] Speaker B: Yeah, absolutely. So I've been here for 13 years; this December, I think, is the official date. Last I checked, I was the third employee, which is pretty early on. I started working on the certification and analysis team doing testing of mobile devices, like pre-launch devices and that sort of thing.
Did a lot of analysis of connected devices and mobile devices before moving into Director of Quality Assurance, and then my current position of Director of Quality Management, focused on our strategic quality and our overall quality goals.

[00:02:11] Speaker A: Well, thanks for sharing that, Asa. It's always fun to talk to folks who've been with Apkudo since the early days, as you have, and seen quite a bit of the evolution of Apkudo. And not just Apkudo's 13 years, but 13 years in the connected device industry, which is pretty impressive because the industry itself has undergone quite a bit of evolution in that time. So I'm just thrilled to have you here and hear from your experience and your perspective on this topic. Let's start off by having you explain what we mean by grading for connected devices.

[00:02:46] Speaker B: So when we talk about grading, there are two primary factors that determine the resale value of phones in the secondary market, or pre-loved phones, as I call them. Functional state is number one, which most people would equate to "does it work?" Then there is cosmetic state, which most people would equate to "what does it look like?" Cosmetic state is qualified with a range of what we call grades. It can depend on the customer, but it can go from grade A, or like new, down lower and lower through B, C and D, where D is heavy cosmetic damage. Cosmetic grading is the process that determines the cosmetic state.

[00:03:26] Speaker A: Well, that's interesting. You think of grading and sometimes you might just think of it all lumped together, but I appreciate knowing that it's thought about from the functional side and the cosmetic side. People might be unfamiliar with that term, cosmetic grading. Can you define that a little bit?

[00:03:42] Speaker B: Yeah. So again, it's that process that determines the cosmetic state of the device.
It's how we associate a letter grade with a device, bucketing devices into different categories. So new devices or grade A devices might have a different purpose in the secondary market than grade D devices. Cosmetic grading can be done by people, and it can be done by machines or robots. And there are generally some options there, right? When you determine that grade, you either get it right, you overgrade it, or you undergrade it.

[00:04:12] Speaker A: What do you mean by overgrade or undergrade?

[00:04:15] Speaker B: Overgrading is when a device is graded as being in a better condition than its actual state. So imagine you're a consumer buying a phone online: you order a device that is advertised as pre-loved, like new, and you find out that it is actually worse. Undergrading is grading a device as being in a worse condition than its actual state. Both of these have different impacts. Overgrading is going to result in frustrated consumers and higher consumer support costs, as they come back and say, hey, this is not what I purchased. Undergrading is lost value to you, the seller.

[00:04:56] Speaker A: Yeah, leaving money on the table, essentially. Gotcha. Well, you mentioned the use of both people and robotics in this process. What are the humans doing, what are the robots doing, and how are they working together?

[00:05:10] Speaker B: It depends on who's doing the cosmetic grading, people or robots. People are generally given a written guideline. There's some standard, often customer specific or industry specific, that states the lighting conditions in which you have to assess a device and the equipment you have to use. Maybe you have a magnifying glass, maybe you have a ruler to measure the length of a defect. There are instructions provided, such as how far you have to hold the device from your eyes.
There are definitions included in those guidelines that tell you what's a scratch versus a crack, what's a dent, what's discoloration versus a watermark, and so on. With those guidelines, there are four to eight surfaces that people need to look at: the front, the back, the sides, the corners. There's that predetermined distance that they have to maintain. There is the actual assessment that they're doing of those defects. They have a limited amount of time to perform that assessment, usually four to six seconds, depending on the surface. And in that four to six seconds, or shortly after, they also have to correlate everything they just saw against that standard. From a machine perspective, the person just gets the device ready to be assessed. It enters what we call the cosmetic inspection cells, part of the RFA line, and pictures of the device are taken by RFA at different angles and under different lighting conditions. Then our algorithms analyze those images to identify the defects and classify them.

[00:06:43] Speaker A: So it sounds like there are different levels of precision that can be achieved by humans versus robotics, and humans can possibly discern things that robots couldn't. Which makes me think that each type of grader, human or robot, has a unique ability here: there are certain things one can do that the other can't. Is the goal to eliminate humans, to lean into robotics, or both? What is really the way to go about this that is most beneficial to our customers?

[00:07:21] Speaker B: I definitely don't think we want to eliminate humans. We might get in trouble for saying that. But what we do want to do is make sure that humans and robots are really working together. The machine is always going to give you a greater level of detail than a person is.
It's going to more objectively tell you what you're seeing, so what type of defect it is. It's going to more objectively tell you how many defects there are, and it's going to tell you how big they are, classifying them in a different way. Where a person might ballpark the length of a scratch, the machine would be able to say it's X millimeters; where a person might say there are many scratches on the screen, with RFA we're going to state explicitly what the number of scratches is. We really want to use people to train the machine, to train the robot in intent. What is their intent when they are grading a device? How are they interpreting the standard? We want those people to teach the robot what their intent is.

[00:08:31] Speaker A: Got it. So the subjectivity of a human, and their ability to really see the bigger picture and understand the why behind this process, can be the real benefit to having humans involved in the process. Robots can't know anything until a human tells them what to know. So that distinction, and those different abilities working together, creates the goal, the intent and the outcome. Well, thanks for sharing that. It sounds like the best case scenario, though, is when you've got robots and humans coming together and agreeing on the condition of a device. How do you achieve that?

[00:09:24] Speaker B: Well, that is a pretty big challenge. In a perfect world, absolutely, the human and the robot agree and we all go home at the end of the day very happy. The problem is that right now people don't agree with themselves and people don't agree with others, and we have to deal with both of those things. When we train the machine, there are three elements, or maybe buckets or metrics as we might call them, that we look at. The first is accuracy, and accuracy is effectively that you get the correct result.
But as I've just stated, if people don't agree with themselves or with others, what is the correct result? We don't actually know in that case, because a different person might decide something different, or if I go back and look at the same device later, I might decide something different then as well. The current processes in place assume that people are 100% accurate, and that is just not the case. We've seen that number vary across all of our customer sites; it is probably closer to 60 to 70% or lower. So we are trying to make something accurate using a standard that is not accurate. The next metric, reproducibility, is the same person getting the same results. Most cosmetic processes also assume that people are reproducible: when I look at this device on Monday, I'm going to say it's a B, and when I do the same thing on Thursday, it's also going to be a B. That is not always the case either. We see that number is probably closer to 80 or 90%, not 100. The last one, which is probably one of the most critical ones that we are focused on, is agreement. Agreement is different people getting the same result. So if I grade a device and you grade the same device, we should both agree that it's a B. What we are seeing is that people don't agree with each other. There are often borderline cases, especially when you look at the number of permutations these written standards have in terms of defect length or surface, how much time you had, or the lighting being slightly off. For agreement, we've seen that number is close to about 50%. So the goal for people here is to have them train the machine to understand what the current limitations are, and for everybody to work together to overcome those limitations: again, accuracy, reproducibility and agreement.

[00:11:59] Speaker A: Interesting.
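For illustration only, the three metrics Asa lists can be sketched as simple match-rate calculations over repeated blind grades. The graders, passes and grade values below are invented sample data, not Apkudo results, and `reference` stands in for the aligned grades produced later in a joint session:

```python
from itertools import combinations

# Made-up blind-assessment data: each grader grades the same four
# devices in two independent passes (no talking, no shared results).
grades = {
    "grader_1": [["B", "A", "C", "B"], ["B", "B", "C", "B"]],
    "grader_2": [["B", "A", "B", "C"], ["B", "A", "B", "B"]],
}
reference = ["B", "A", "C", "B"]  # hypothetical aligned "true" grades

def rate(xs, ys):
    """Fraction of devices where two grade lists match."""
    return sum(x == y for x, y in zip(xs, ys)) / len(xs)

# Accuracy: each grader's first pass versus the aligned reference.
accuracy = {g: rate(passes[0], reference) for g, passes in grades.items()}

# Reproducibility: the same grader's two passes compared to each other.
reproducibility = {g: rate(p1, p2) for g, (p1, p2) in grades.items()}

# Agreement: first passes of every pair of graders compared pairwise.
agreement = {
    (a, b): rate(grades[a][0], grades[b][0])
    for a, b in combinations(grades, 2)
}

print(accuracy, reproducibility, agreement)
```

On this toy data, a grader can be perfectly accurate yet not reproducible, and two individually reasonable graders can still disagree on half the devices, which is the pattern the conversation describes.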
Yeah, well, so how do you do that, and what does that look like in a customer scenario?

[00:12:06] Speaker B: It's a lot. It's a very complex topic. For what we call cosmetic authentication, the first step is to establish a control group. First, we have a set of devices that we select. Device selection can depend on the site, but generally we're looking at representative attributes: model, color, cosmetic defects that represent what that site is expected to process. We don't want to train the machine on iPhone if the facility only runs Android, for example. So we start with device selection, and we try to target about 100 devices for this reference group as our control sample group. For that group of devices, we perform an independent blind assessment with our customers. They have established a number of qualified manual graders: people who have gone through their training processes, who are familiar with their written standards, and whom they trust to do these processes. What we want to do is identify their accuracy, their reproducibility and their agreement, specifically agreement here. So we have each of them perform their process, as written in their standard, on each of these devices individually. They do not talk, we moderate, we make sure there's no cross contamination, and we remove any indication of what the grade might be. When they're doing that process, they don't know; they only know the IMEI of the device. We ask them to provide us with the result for the device per their standard, the defects they see by surface, and then an indication of whether each defect matters. It's very possible that they would see something and say, hey, I see a scratch on the surface, but that scratch does not matter to me; this other defect does. Once we complete that blind assessment process, we can then look at agreement. We can determine accuracy if these were devices that were run in production before.
So if they have some existing result, we can do that comparison. The next part of the cosmetic authentication process is to align on what those results should be. We moderate the graders in a joint session where they have the opportunity to discuss their results, and we can ask questions like: hey, you saw a discoloration on this surface and you saw a watermark. What's the difference between those? Can you explain them, and how do they matter? You saw a scratch and you saw a crack. Can you align on that? There we have the ability to understand which defects they see and agree on, which grades they agree on, and whether the things they see matter when contributing towards those grades. We then capture objective data about those defects when we can, like measuring scratch length. The goal of the reference group we're establishing is really to teach the machine human intent. We want to know, again, what matters and what doesn't. From this group we can establish reproducibility, by running the devices multiple times and seeing if the results reproduce, and a general accuracy. There's a big risk that is part of this process, which is the risk of overfitting. If we train our algorithms too closely to this data set, then when you use the algorithm on a new device, there's a big chance it won't be accurate. In other words, we could potentially perfect this group of devices, and if we perfect that group of devices, it will no longer represent what's actually going to happen in a production environment in the real world.

[00:15:50] Speaker A: Gotcha. Almost fine-tuning it to the point that it's...

[00:15:53] Speaker B: Yeah, yeah. There's a risk of diminishing returns there, where you start to perfect a device, and by perfecting that device, you have negatively influenced the grades of all the other devices.

[00:16:07] Speaker A: Wow, that's really interesting.
What you just described is, I think, a really helpful way to understand the human value in this process, as opposed to the obvious value that the robotics provide in their precision and their ability to see these defects objectively. The humans are able to decide and describe what matters and why, and that hits at the intent part, which is the customer layer, right? What the customer cares about. That's the interaction between what we can do from a technical standpoint and what the customer needs us to do, and that feels like a very important point to hit.

[00:16:55] Speaker B: That's absolutely correct. The last part of the cosmetic authentication process that I want to mention is that once we do that, we only get accuracy to a certain point. You can get accuracy to the level of a general accuracy for the algorithm. Something else needs to happen, which is large sample runs of manually graded devices. We use those large sample runs to refine the algorithm. Specifically, what these large samples are doing is introducing diversity of surfaces and diversity of defects. So we also want to know, outside of this control group, what else is happening, and we want to understand and refine based on that. Where we start to look at a proxy for accuracy is the distribution of the machine grades versus the manual grades: the difference between the robotics and the graders. If we can reduce that mismatch between people and machine, that's the target. We want those groups of As and Bs and Cs and Ds to be the same or similar between people and machines.

[00:17:57] Speaker A: And that would require a lot of data, and having those control groups set up at multiple sites with multiple customers to uncover those types of insights. Is that accurate?

[00:18:11] Speaker B: Yeah, I think it absolutely is.
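As a rough sketch of the distribution proxy Asa describes, one could compare the share of each grade between a manual run and a machine run of the same large sample. The grade lists below are invented, and total variation distance is just one plausible choice of mismatch measure, not necessarily what Apkudo uses:

```python
from collections import Counter

# Made-up grade outcomes for the same sample run: one list from manual
# graders, one from the machine. Real runs would be far larger.
manual  = ["A", "B", "B", "C", "B", "D", "A", "C", "B", "B"]
machine = ["A", "B", "C", "C", "B", "D", "B", "C", "B", "A"]

def distribution(outcomes):
    """Share of each grade in a run, e.g. {'A': 0.2, 'B': 0.5, ...}."""
    counts = Counter(outcomes)
    total = len(outcomes)
    return {g: counts.get(g, 0) / total for g in "ABCD"}

def mismatch(p, q):
    """Total variation distance between two grade distributions (0 = identical)."""
    return 0.5 * sum(abs(p[g] - q[g]) for g in "ABCD")

print(distribution(manual))
print(distribution(machine))
print(mismatch(distribution(manual), distribution(machine)))
```

The point of the proxy is that two runs can disagree on individual devices yet still produce nearly identical grade distributions, which is the trend-level alignment the conversation targets.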
So there's definitely a significant amount of data needed to train the algorithms and the machines in the right way.

[00:18:23] Speaker A: That's awesome. So it seems like, with all this data being processed and these comparisons between human and machine, this could have a bigger impact, right? Beyond just what it will do for individual customers. We're already scaling our minds out here to what this bigger impact could be.

[00:18:42] Speaker B: From an impact perspective, I would say one of my goals is to reset the way we look at the accuracy of cosmetic grading across the industry. To move forward as we transition from people to robotics, both we and our partners need to understand the limits of manual grading and work collaboratively with us to overcome them.

[00:19:08] Speaker A: Yeah. So not only are you working to achieve alignment between humans and robots in the process, but you also want to achieve alignment between the different partners in the supply chain around the expectations for creating a standard for cosmetic grading. So agreement and alignment is kind of an overarching theme here. Do you know that you're doing that in the right way?

[00:19:37] Speaker B: That's a really good question. Like I said, it's a very complex transition from people to machine, and we try to be very data driven about everything we do at Apkudo. Right now, this is what the data tells us: we're trying to measure accuracy against a set of data that's not accurate. There's high variation between graders. If a person is not accurate and people don't agree with each other, we can't ever achieve that device-by-device measurement accuracy. And machine accuracy isn't when every single device is accurate; it's when the results align with the intent of the human and the overall trend is there.
So while there's no way for us to really see into the future, the data we have right now says we're on the right track, and as new data is introduced, we'll have to adjust accordingly.

[00:20:29] Speaker A: Yeah, interesting. Well, you've also brought up a theme that comes up a lot in these podcast conversations, and that is the collaboration that we have with customers. While we are expanding our capabilities and expanding their capabilities, we're having a conversation about what that means for them and how they best want to use these capabilities. I can imagine that collaborating with customers provides opportunities and challenges in that process. What gets you excited about the future, or about where Apkudo is going, when it comes to this kind of collaboration for cosmetic grading?

[00:21:10] Speaker B: The most exciting thing for me is certainly the transition away from customers and into partnerships. I want to work collaboratively with all of our customers through these challenges. I don't want to necessarily solve this in a silo. I think the more collaboration we have with each other, and the more collaboration we have between people working with machines, the more it allows us to align and continuously redefine our objectives together.

[00:21:35] Speaker A: Yeah, that's awesome. I've had the opportunity to see that in real time, visiting a customer, seeing our production line, and seeing Apkudo employees right next to employees of our customers, discussing the impacts of what the machine is doing and what the humans are doing. That really brought it to life for me. And having this conversation with you about all of the inputs that feed into that, the way you test it and work to improve accuracy, is really interesting; it lets you see the layers beneath what happens on the floor in the processing of the devices.
So I appreciate you coming on today to talk about this and to help us learn more about cosmetic grading. You have a very interesting role at Apkudo, in that you're right at the center of the robots and the humans working collaboratively together. It's been really great to talk with you and understand this topic more. So thank you again for being here, Asa.

[00:22:37] Speaker B: Yeah, it's great to be on. Thanks for having me.

[00:22:40] Speaker A: Yeah. Well, thanks for listening to today's podcast session. If you have any feedback, ideas or questions about future topics or guests, again, email me at allyson.mitchell@apkudo.com and we'll be sure to tackle that in a future episode. Thanks again for joining, and we will see you at the next episode.

Other Episodes

Episode

March 10, 2025 00:13:30

Simplifying Device Recovery: How a Fortune 100 Company Boosted Sustainability and Compliance

What simple technology integration nudges employees to return their old devices without disrupting their workflow, and how could it save your company thousands in...


Episode

September 06, 2024 00:38:54

New Product Introduction: Precision, Preparation, and Predictive Data

Grant Cushny, Senior Director of Technical Delivery Management at Apkudo, takes us through his remarkable journey across continents and industries. In this episode: 0:00...


Episode 3

June 20, 2024 00:36:04

Data Matters: Unleashing the Power Behind AI, Automation, and Robotics

Allyson Mitchell and Chad Gottesman, President of Apkudo, explore how effectively leveraging data in the connected device supply chain can help you gain a...
