Title: Cyberphysical Smart Cities Infrastructures
Author: Group of authors
Publisher: John Wiley & Sons Limited
Genre: Physics
ISBN: 9781119748328
The authors benchmarked these algorithms on PointGoal and RoomGoal tasks and found that, first, the naive feedforward algorithm fails to learn any useful representation and, second, DFP performs better in small environments, while UNREAL outperforms the others in larger, more complex environments.
3.4.2 Habitat
Habitat was designed and built to provide maximum customizability in terms of the datasets that can be used and how the agents and the environment can be configured. Accordingly, Habitat works with all the major 3D environment datasets without a problem. Moreover, it is extremely fast compared with other simulators: AI2‐THOR and CHALET reach roughly 10 fps, MINOS and Gibson reach around a hundred, and House3D yields 300 fps in the best case, while Habitat is capable of reaching up to 10 000 fps. It also provides a more realistic collision model in which, when a collision occurs, the agent may be displaced only partially, or not at all, in the intended direction.
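To make the workflow concrete, the following is a minimal sketch of stepping through a PointGoal episode with the habitat-lab Python API. It assumes habitat-lab is installed and scene data has been downloaded; the exact config path and action names vary across habitat-lab versions, so treat them as placeholders rather than a definitive recipe.

```python
import habitat

# Config path is version-dependent; "configs/tasks/pointnav.yaml" is an
# assumed example from older habitat-lab releases.
config = habitat.get_config("configs/tasks/pointnav.yaml")
env = habitat.Env(config=config)

observations = env.reset()  # dict of sensor readings, e.g. "rgb", "depth", "pointgoal"
while not env.episode_over:
    # Trivial placeholder policy: always move forward.
    # A real agent would choose actions based on the observations.
    observations = env.step({"action": "MOVE_FORWARD"})

print(env.get_metrics())  # includes "spl" when that measure is configured
env.close()
```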
To benchmark Habitat, the authors employed a few naive algorithmic baselines, with proximal policy optimization (PPO) [81] representing learning algorithms and ORB‐SLAM2 [82, 83] as the chosen candidate for non‐learning agents, and tested them on the PointGoal navigation task on Gibson and Matterport3D. They used Success weighted by Path Length (SPL) [84] as the performance metric. The PPO agent was tested with different combinations of sensors (e.g. no visual sensor, only depth, only RGB, and RGBD) as an ablation study to quantify how much each sensor contributes. SLAM agents were given RGBD sensors in all episodes.
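For reference, SPL [84] averages a per-episode success indicator weighted by path efficiency:

$$\mathrm{SPL} = \frac{1}{N} \sum_{i=1}^{N} S_i \, \frac{\ell_i}{\max(p_i, \ell_i)},$$

where $N$ is the number of episodes, $S_i$ is a binary success indicator for episode $i$, $\ell_i$ is the shortest (geodesic) path length from start to goal, and $p_i$ is the length of the path the agent actually took. An agent therefore scores highest by succeeding via near-shortest paths.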
The authors found that, first, PPO agents with only RGB sensors perform as badly as agents with no visual sensors at all. Second, all agents perform better and generalize better on Gibson than on Matterport3D, since the environments in the latter are larger. Third, agents with only depth sensors generalize best across datasets and achieve the highest SPL. Most importantly, however, they observed that, contrary to what had been reported in previous work, if the PPO agent is trained long enough, it eventually outperforms the traditional SLAM pipeline. This finding was possible only because the Habitat simulator was fast enough to train PPO agents for 75 million time steps, as opposed to only 5 million time steps in previous investigations.
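Since PPO is the learning baseline throughout these experiments, a short sketch of its clipped surrogate objective may help. This is a generic PyTorch rendering of the loss from [81], not the authors' training code, and the function name is ours:

```python
import torch

def ppo_clip_loss(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Clipped PPO policy loss (to be minimized)."""
    # Probability ratio between the current policy and the one that
    # collected the data.
    ratio = torch.exp(new_log_probs - old_log_probs)
    # Pessimistic (elementwise minimum) of the unclipped and clipped
    # surrogate terms; clipping keeps policy updates conservative.
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    return -torch.min(unclipped, clipped).mean()
```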
3.5 Future of Embodied AI
3.5.1 Higher Intelligence
Consciousness has always been considered the ultimate hallmark of true intelligence. Qualia [85, 86] is the philosophical term for the subjective qualities of conscious experience, such as "the redness of red," that humans hold in their minds. If at some point machines can understand this concept and objectively measure such things, the ultimate goal can be marked as accomplished.
Robots still struggle to perform a wide spectrum of tasks effortlessly and smoothly, mainly because of actuator technology: most robots currently rely on electric motors. Advances in artificial muscles, along with skin sensors that could cover the agent's entire embodiment, will be essential to fully replicate the human experience of the real world and eventually unlock the desired cognition [87].
3.5.2 Evolution
One more key component of cognition is the ability to grow and evolve over time [88, 90]. It is easy to evolve the agent's controller via an evolutionary algorithm, but that alone is not enough. If we aim to have truly different agents, we must also give them the ability to evolve in terms of their embodiment and sensors. This again requires the abovementioned artificial cell organisms to encode different physical attributes and mutate them slightly over time, as in the toy sketch below. Of course, we are far from this becoming reality, but it is always good to know the furthest step that must one day be taken.
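As a rough illustration of the idea, here is a toy sketch of co-evolving controller and morphology parameters. The genome encoding, mutation scheme, and fitness function are all hypothetical placeholders, not a method from the chapter:

```python
import random

def mutate(genome, sigma=0.05):
    # Perturb every gene (controller weights as well as embodiment traits
    # such as limb length or sensor placement) with small Gaussian noise.
    return {key: value + random.gauss(0.0, sigma) for key, value in genome.items()}

def evolve(population, fitness, generations=100):
    # Truncation selection: keep the fitter half, refill with mutated copies.
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: len(ranked) // 2]
        population = parents + [mutate(parent) for parent in parents]
    return max(population, key=fitness)

# Example usage with a dummy fitness that rewards a longer "limb" gene:
seed = [{"limb_length": random.random(), "w0": 0.0} for _ in range(20)]
best = evolve(seed, fitness=lambda g: g["limb_length"])
```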
3.6 Conclusion
Embodied AI is the field of study that takes us one step closer to true intelligence. It marks a shift from Internet AI toward embodied intelligence: exploiting agents' multisensory abilities, such as vision, hearing, and touch, together with language understanding and reinforcement learning, to interact with the real world in a more sensible way. In this chapter, we presented a concise review of this field, its current advances, subfields, and tools, in the hope of helping accelerate future research in this area.