landscape UI screenshots. We manually examine the reported
12,621 violation instances and adopt a statistical sampling
technique [18] to examine a minimum number MIN of
the remaining 50,829 app UIs without reported violations. The
estimated accuracy has a 0.05 error margin at a 95% confidence
level, for which we sample and examine 3,490 app UIs in total.
UIS-Hunter achieves a precision of 0.81, a recall of 0.90, and
an F1-score of 0.87. Based on the manually validated detection
results, UIS-Hunter detects 7,497 unique UIs that contain
true-positive design violations against the 45 don't-guidelines of
10 types of UI components, including all of the violation
examples in Fig. 1. These results indicate that 12.3% of the 60,756
app UIs contain at least one confirmed design violation,
and 16% of these 7,497 UIs contain more than one violation.
Among the 9,286 apps, 6,699 (72%) have more than two
design violations.
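The sampling technique [18] determines the minimum sample size MIN from the error margin and confidence level. The exact procedure that yields the 3,490 examined UIs is not detailed in this section, so the following is only a minimal sketch of a standard Cochran-style sample-size computation at a 0.05 margin and 95% confidence; the function name and the use of a finite-population correction are our assumptions.

```python
import math

def min_sample_size(population: int, z: float = 1.96,
                    margin: float = 0.05, p: float = 0.5) -> int:
    """Cochran's sample-size formula with finite-population correction.

    z=1.96 corresponds to a 95% confidence level; p=0.5 is the most
    conservative assumed proportion of violating UIs.
    """
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    # Correct for the known, finite population of unflagged UIs.
    return math.ceil(n0 / (1 + (n0 - 1) / population))

# E.g. for the 50,829 app UIs without reported violations:
print(min_sample_size(50_829))  # 382
```

Note that this gives the minimum for a single simple random sample; a larger total (such as the 3,490 UIs examined) would arise if sampling is stratified, e.g. per component type.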
B. User Study
To further investigate whether developers can effectively detect
UI design smells and how ordinary app users perceive the
reported UI design smells, we recruit 5 front-end developers
to distinguish the 27 identified violation UIs from a total of 40 UIs,
and ask 3 male and 2 female app users to independently rate
the severity of each violation in the 27 UIs. The results show
that manual detection of UI design smells has lower overall
precision and recall than automated detection, and that 15
out of 18 guidelines are considered by the majority of users
to seriously affect the user experience. Five guidelines have a
recall below 0.4, indicating that developers often fail to detect
violations of them. For example, the guideline "an outlined button's
width shouldn't be narrower than the button's text length" requires
checking whether the button is outlined and then comparing the
button's width with the length of its text.
14 guidelines are considered severe, such as "don't mix tabs that
contain only text with tabs that contain only icons". We
also find some controversial guidelines (i.e., with close ratios
of severe versus non-severe ratings). For example, booking.com's
Android app applies icons to some destinations but not
others in the navigation drawer. Some users think that omitting
icons actually helps distinguish auxiliary features (e.g., help,
contact us) from the main booking-related features.
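The outlined-button guideline above illustrates why such checks are tedious by hand but mechanical to automate. A minimal sketch of the rule check, assuming a hypothetical `Button` representation (UIS-Hunter's actual component model and attribute names are not given in this section):

```python
from dataclasses import dataclass

@dataclass
class Button:
    # Hypothetical fields; the tool's real component model may differ.
    style: str         # e.g. "outlined", "contained", "text"
    width: float       # rendered button width, in px
    text_width: float  # rendered width of the button's label, in px

def violates_outlined_width_guideline(button: Button) -> bool:
    """Don't-guideline: an outlined button's width shouldn't be
    narrower than the button's text length."""
    return button.style == "outlined" and button.width < button.text_width

print(violates_outlined_width_guideline(Button("outlined", 48.0, 96.0)))   # True: label overflows
print(violates_outlined_width_guideline(Button("contained", 48.0, 96.0)))  # False: rule only applies to outlined buttons
```

Both conditions must hold for a violation, which matches the two-step manual check (identify the style, then compare widths) that developers reportedly miss.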
V. CONCLUSION AND FUTURE WORK
In this paper, we present UIS-Hunter, a tool for automatically
detecting UI design smells against a wide range of
design guidelines in Material Design, along with a Material
Design guideline gallery that presents the results of our
demographic study of Material Design guidelines and instantiates
each guideline with conformant and violating UI examples.
Through the UIS-Hunter tool, developers can upload their
UI designs in two formats (SVG from design tools and
JSON from UI testing frameworks) and receive a detailed UI
design smell report. The Material Design guideline gallery allows
designers and developers to search and filter the guidelines and
examples they are interested in, reducing the time spent
browsing the complex Material Design website. In the future,
we plan to extend UIS-Hunter to support implicit guidelines in
Material Design and de facto guidelines emerging from real-world
apps, as well as other design systems that describe visual
do/don't-guidelines for a library of UI components in a similar
vein. We also plan to integrate UIS-Hunter with UI design tools
to support just-in-time UI design smell detection.
Acknowledgement. This research was partially sup-
ported by the National Key R&D Program of China (No.
2019YFB1600700), Australian Research Council’s Discovery
Early Career Researcher Award (DECRA) funding scheme
(DE200100021), ARC Discovery grant (DP200100020), and
National Science Foundation of China (No. U20A20173).
REFERENCES
[1] G. Suryanarayana, G. Samarthyam, and T. Sharma, Refactoring for
software design smells: managing technical debt. Morgan Kaufmann,
2014.
[2] “How Google Material Design affects mobile app design,”
2018. [Online]. Available: https://www.businessofapps.com/news/
how-google-material-design-affects-mobile-app-design/
[3] “The designer’s guide to accessibility research,”
2018. [Online]. Available: https://design.google/library/
designers-guide-accessibility-research/
[4] M. Fowler, Refactoring: improving the design of existing code.
Addison-Wesley Professional, 2018.
[5] F. Palomba, G. Bavota, M. Di Penta, R. Oliveto, A. De Lucia, and
D. Poshyvanyk, “Detecting bad smells in source code using change
history information,” in 2013 28th IEEE/ACM International Conference
on Automated Software Engineering (ASE). IEEE, 2013, pp. 268–278.
[6] H. Liu, Q. Liu, Z. Niu, and Y. Liu, “Dynamic and automatic feedback-
based threshold adaptation for code smell detection,” IEEE Transactions
on Software Engineering, vol. 42, no. 6, pp. 544–558, 2015.
[7] “Findbugs,” 2020. [Online]. Available: https://github.com/
findbugsproject/findbugs
[8] “Stylelint,” 2020. [Online]. Available: https://stylelint.io/
[9] K. Moran, B. Li, C. Bernal-Cárdenas, D. Jelf, and D. Poshyvanyk,
“Automated reporting of gui design violations for mobile apps,” in Pro-
ceedings of the 40th International Conference on Software Engineering,
2018, pp. 165–175.
[10] D. Zhao, Z. Xing, C. Chen, X. Xu, L. Zhu, G. Li, and J. Wang, “Seeno-
maly: Vision-based linting of gui animation effects against design-don’t
guidelines,” in 42nd International Conference on Software Engineering
(ICSE’20). ACM, New York, NY, 2020.
[11] Z. Wu, Y. Jiang, Y. Liu, and X. Ma, “Predicting and diagnosing user
engagement with mobile ui animation via a data-driven approach,”
in Proceedings of the 2020 CHI Conference on Human Factors in
Computing Systems, 2020, pp. 1–13.
[12] M. Xie, S. Feng, J. Chen, Z. Xing, and C. Chen, “Uied: A hybrid tool
for gui element detection,” in Proceedings of the 2020 28th ACM Joint
Meeting on European Software Engineering Conference and Symposium
on the Foundations of Software Engineering, 2020.
[13] J. Chen, M. Xie, Z. Xing, C. Chen, X. Xu, and L. Zhu, “Object
detection for graphical user interface: Old fashioned or deep learning
or a combination?” in Proceedings of the 2020 28th ACM Joint Meeting
on European Software Engineering Conference and Symposium on the
Foundations of Software Engineering, 2020.
[14] X. Zhou, C. Yao, H. Wen, Y. Wang, S. Zhou, W. He, and J. Liang,
“East: an efficient and accurate scene text detector,” in Proceedings of
the IEEE conference on Computer Vision and Pattern Recognition, 2017,
pp. 5551–5560.
[15] C. Chen, S. Feng, Z. Xing, L. Liu, S. Zhao, and J. Wang, “Gallery
dc: Design search and knowledge discovery through auto-created gui
component gallery,” Proceedings of the ACM on Human-Computer
Interaction, vol. 3, no. CSCW, pp. 1–22, 2019.
[16] J. Canny, “A computational approach to edge detection,” IEEE Transac-
tions on pattern analysis and machine intelligence, no. 6, pp. 679–698,
1986.
[17] B. Deka, Z. Huang, C. Franzen, J. Hibschman, D. Afergan, Y. Li,
J. Nichols, and R. Kumar, “Rico: A mobile app dataset for building
data-driven design applications,” in Proceedings of the 30th Annual ACM
Symposium on User Interface Software and Technology, 2017, pp. 845–
854.
[18] R. Singh and N. S. Mangat, Elements of survey sampling. Springer
Science & Business Media, 2013, vol. 15.